Novel human hydrolase family members and uses thereof

ABSTRACT

The invention provides isolated nucleic acid molecules, designated 26443, 46873, 61833, 26493, 58224, 46980, 32225, 47508, 56939, 33410, 33521, 23479, 48120, 46689, 80091, and 46508 nucleic acid molecules, which encode novel human hydrolase family members. The invention also provides antisense nucleic acid molecules, recombinant expression vectors containing 26443, 46873, 61833, 26493, 58224, 46980, 32225, 47508, 56939, 33410, 33521, 23479, 48120, 46689, 80091, or 46508 nucleic acid molecules, host cells into which the expression vectors have been introduced, and nonhuman transgenic animals in which a 26443, 46873, 61833, 26493, 58224, 46980, 32225, 47508, 56939, 33410, 33521, 23479, 48120, 46689, 80091, or 46508 gene has been introduced or disrupted. The invention still further provides isolated 26443, 46873, 61833, 26493, 58224, 46980, 32225, 47508, 56939, 33410, 33521, 23479, 48120, 46689, 80091, or 46508 proteins, fusion proteins, antigenic peptides and anti-26443, 46873, 61833, 26493, 58224, 46980, 32225, 47508, 56939, 33410, 33521, 23479, 48120, 46689, 80091, or 46508 antibodies. Diagnostic methods utilizing compositions of the invention are also provided.

RELATED APPLICATIONS

[0001] This application is a continuation-in-part and claims priority to U.S. application Ser. No. 09/816,664, filed Mar. 23, 2001, which claims the benefit of U.S. Provisional Application Ser. No. 60/191,973, filed Mar. 24, 2000; and U.S. application Ser. No. 09/841,880, filed Apr. 24, 2001, which claims the benefit of U.S. Provisional Application Ser. No. 60/199,559, filed Apr. 25, 2000; and U.S. application Ser. No. 09/862,556, filed May 22, 2001, and International Application Serial No. PCT/US01/16424, filed May 22, 2001, which claim the benefit of U.S. Provisional Application Ser. No. 60/206,036, filed May 22, 2000; and U.S. application Ser. No. 09/861,165, filed May 18, 2001, and International Application Serial No. PCT/US01/16014, filed May 18, 2001, which claim the benefit of U.S. Provisional Application Ser. No. 60/205,442, filed May 19, 2000; and U.S. application Ser. No. 09/875,353, filed Jun. 6, 2001, and International Application Serial No. PCT/US01/18335, filed Jun. 6, 2001, which claim the benefit of U.S. Provisional Application Ser. No. 60/209,949, filed Jun. 6, 2000; and U.S. application Ser. No. 09/896,578, filed Jun. 29, 2001, and International Application Serial No. PCT/US01/20880, filed Jun. 29, 2001, which claim the benefit of U.S. Provisional Application Ser. No. 60/214,948, filed Jun. 29, 2000; and U.S. application Ser. No. 09/911,150, filed Jul. 23, 2001, and International Application Serial No. PCT/US01/23153, filed Jul. 23, 2001, which claim the benefit of U.S. Provisional Application Ser. No. 60/220,008, filed Jul. 21, 2000; and U.S. application Ser. No. 09/911,317, filed Jul. 23, 2001, and International Application Serial No. PCT/US01/23160, filed Jul. 23, 2001, which claim the benefit of U.S. Provisional Application Ser. No. 60/220,040, filed Jul. 21, 2000; and U.S. application Ser. No. 09/934,323, filed Aug. 21, 2001, and International Application Serial No. PCT/US01/26091, filed Aug. 21, 2001, which claim the benefit of U.S. Provisional Application Ser. No. 60/226,774, filed Aug. 21, 2000; and U.S. application Ser. No. 09/963,959, filed Sep. 25, 2001, and International Application Serial No. PCT/US01/29962, filed Sep. 25, 2001, which claim the benefit of U.S. Provisional Application Ser. No. 60/235,033, filed Sep. 25, 2000; and U.S. application Ser. No. 09/971,490, filed Oct. 5, 2001, and International Application Serial No. PCT/US01/31674, filed Oct. 5, 2001, which claim the benefit of U.S. Provisional Application Ser. No. 60/238,170, filed Oct. 5, 2000; and U.S. application Ser. No. 10/071,275, filed Feb. 7, 2002, and International Application Serial No. PCT/US02/03793, filed Feb. 7, 2002, which claim the benefit of U.S. Provisional Application Ser. No. 60/267,054, filed Feb. 7, 2001; and U.S. application Ser. No. 09/888,911, filed Jun. 25, 2001, and International Application Serial No. PCT/US01/19967, filed Jun. 25, 2001, which claim the benefit of U.S. Provisional Application Ser. No. 60/213,688, filed Jun. 23, 2000, the contents of which are incorporated herein by reference.

BACKGROUND OF THE 26443 AND 46873 INVENTION

[0002] Asparaginase is an enzyme that catalyzes the hydrolysis of asparagine to aspartic acid and ammonia. Saccharomyces cerevisiae expresses two forms of asparaginase: L-asparaginase I, a cytoplasmic enzyme that is synthesized constitutively, and asparaginase II, a cell wall mannan protein localized external to the cell membrane, which plays a role in hydrolysis of exogenous asparagine and uptake of aspartic acid. The two enzymes are biochemically and genetically distinct.

[0003] Because some lymphoid tumor cells are deficient in L-asparagine synthetase and cannot synthesize sufficient L-asparagine, asparagine is, for these cells, an essential amino acid. Therefore, asparagine depletion by administration of asparaginase rapidly results in decreased protein synthesis, followed by a decrease in DNA and RNA synthesis, and ultimately cell death.

SUMMARY OF THE 26443 AND 46873 INVENTION

[0004] The present invention is based, in part, on the discovery of novel asparaginases, referred to herein as “26443” and “46873” nucleic acid and protein molecules. The nucleotide sequences of cDNAs encoding 26443 and 46873 are shown in SEQ ID NO:1 and SEQ ID NO:4, respectively, and the amino acid sequences of the 26443 and 46873 polypeptides are shown in SEQ ID NO:2 and SEQ ID NO:5, respectively. In addition, the nucleotide sequences of the coding regions of 26443 and 46873 are depicted in SEQ ID NO:3 and SEQ ID NO:6, respectively.

[0005] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes a 26443 or 46873 protein or polypeptide, e.g., a biologically active portion of the 26443 or 46873 protein. In a preferred embodiment, the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO:2 or SEQ ID NO:5. In other embodiments, the invention provides isolated 26443 or 46873 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______ or Accession Number ______. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______ or Accession Number ______. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under stringent hybridization conditions to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______ or Accession Number ______, wherein the nucleic acid encodes a full length 26443 or 46873 protein or a biologically active fragment thereof.

[0006] In a related aspect, the invention further provides nucleic acid constructs that include a 26443 or 46873 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included are vectors and host cells containing the 26443 or 46873 nucleic acid molecules of the invention, e.g., vectors and host cells suitable for producing 26443 or 46873 nucleic acid molecules and polypeptides.

[0007] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 26443- or 46873-encoding nucleic acids.

[0008] In still another related aspect, isolated nucleic acid molecules that are antisense to a 26443- or 46873-encoding nucleic acid molecule are provided.

[0009] In another aspect, the invention features 26443 or 46873 polypeptides, and biologically active or antigenic fragments thereof, that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 26443- or 46873-mediated or related disorders. In another embodiment, the invention provides 26443 or 46873 polypeptides having a 26443 or 46873 activity. Preferred polypeptides are 26443 or 46873 proteins including at least one asparaginase domain, and, preferably, having a 26443 or 46873 activity, e.g., a 26443 or 46873 activity as described herein.

[0010] In other embodiments, the invention provides 26443 or 46873 polypeptides, e.g., a 26443 or 46873 polypeptide having the amino acid sequence shown in SEQ ID NO:2 or SEQ ID NO:5, respectively; the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______ or Accession Number ______; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO:2 or SEQ ID NO:5; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under stringent hybridization conditions to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______ or Accession Number ______, wherein the nucleic acid encodes a full length 26443 or 46873 protein or an active fragment thereof.

[0011] In a related aspect, the invention further provides nucleic acid constructs that include a 26443 or 46873 nucleic acid molecule described herein.

[0012] In a related aspect, the invention provides 26443 or 46873 polypeptides or fragments operatively linked to non-26443 or -46873 polypeptides to form fusion proteins.

[0013] In another aspect, the invention features antibodies and antigen-binding fragments thereof, that react with, or more preferably, specifically bind 26443 or 46873 polypeptides.

[0014] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 26443 or 46873 polypeptides or nucleic acids.

[0015] In still another aspect, the invention provides a process for modulating 26443 or 46873 polypeptide or nucleic acid expression or activity, e.g., using the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 26443 or 46873 polypeptides or nucleic acids, such as metabolic diseases and conditions involving aberrant or deficient oxidation of long- and medium-chain fatty acids.

[0016] The invention also provides assays for determining the activity of, or the presence or absence of, 26443 or 46873 polypeptides or nucleic acid molecules in a biological sample, including for the purpose of disease diagnosis.

[0017] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 26443 or 46873 polypeptide or nucleic acid molecule, including for the purpose of disease diagnosis.

[0018] In another aspect, the invention features a two-dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes a 26443 or 46873 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to a 26443 or 46873 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 26443 or 46873 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.

[0019] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

[0020] FIGS. 1A-1B depict a cDNA sequence (SEQ ID NO:1) and predicted amino acid sequence (SEQ ID NO:2) of human 26443. The methionine-initiated open reading frame of human 26443 (without the 5′ and 3′ untranslated regions) starts at nucleotide 91 and continues through to nucleotide 1344 of SEQ ID NO:1 (coding sequence also shown in SEQ ID NO:3).

[0021] FIG. 2 depicts a hydropathy plot of human 26443. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. The cysteine residues (Cys) are indicated by short vertical lines just below the hydropathy trace. The numbers corresponding to the amino acid sequence of human 26443 are indicated. Polypeptides of the invention include 26443 fragments that include: all or part of a hydrophobic sequence (a sequence above the dashed line); or all or part of a hydrophilic fragment (e.g., a fragment below the dashed line). Other fragments include a cysteine or a glycosylation site.

[0022] FIG. 3 depicts a series of plots summarizing an analysis of the primary and secondary protein structure of a human asparaginase. The particular algorithm used for each plot is indicated at the right-hand side of each plot. The following plots are depicted: Garnier-Robson plots providing the predicted location of alpha-, beta-, turn and coil regions (Garnier et al. (1978) J. Mol. Biol. 120:97); Chou-Fasman plots providing the predicted location of alpha-, beta-, turn and coil regions (Chou and Fasman (1978) Adv. In Enzymol. Mol. 47:45-148); Kyte-Doolittle hydrophilicity/hydrophobicity plots (Kyte and Doolittle (1982) J. Mol. Biol. 157:105-132); Eisenberg plots providing the predicted location of alpha- and beta-amphipathic regions (Eisenberg et al. (1982) Nature 299:371-374); a Karplus-Schulz plot providing the predicted location of flexible regions (Karplus and Schulz (1985) Naturwissenschaften 72:212-213); a plot of the antigenic index (Jameson-Wolf) (Jameson and Wolf (1988) CABIOS 4:121-136); and a surface probability plot (Emini algorithm) (Emini et al. (1985) J. Virol. 55:836-839). The numbers corresponding to the amino acid sequence of human 26443 are indicated.
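By way of illustration only, a hydropathy trace of the general kind described for the hydropathy plots and the Kyte-Doolittle plots can be sketched as a sliding-window average of per-residue hydropathy indices. The sketch below is not the software used to generate the figures. The window length of 9, the threshold of 0.0 standing in for the dashed horizontal line, and the function names hydropathy_profile and hydrophobic_stretches are assumptions made for this example, and the toy peptide is hypothetical rather than any sequence of the invention.

```python
# Kyte-Doolittle hydropathy index for each of the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Return (center_position, mean_hydropathy) pairs; positions are 1-based.

    The window length is an assumption; the figures do not state one.
    """
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[aa] for aa in seq]
    half = window // 2
    profile = []
    for start in range(len(seq) - window + 1):
        mean = sum(scores[start:start + window]) / window
        profile.append((start + half + 1, mean))
    return profile


def hydrophobic_stretches(profile, threshold=0.0):
    """Group consecutive window centers scoring above threshold into (first, last) ranges.

    The threshold plays the role of the dashed line; 0.0 is an assumed value.
    """
    stretches = []
    current = None
    for pos, score in profile:
        if score > threshold:
            if current is None:
                current = [pos, pos]
            else:
                current[1] = pos
        elif current is not None:
            stretches.append(tuple(current))
            current = None
    if current is not None:
        stretches.append(tuple(current))
    return stretches


if __name__ == "__main__":
    # Hypothetical toy peptide: a leucine-rich core flanked by aspartate.
    toy = "DDDDD" + "LLLLLLLLL" + "DDDDD"
    print(hydrophobic_stretches(hydropathy_profile(toy, window=5)))
```

Applied to an actual polypeptide sequence, the ranges returned by hydrophobic_stretches would correspond to candidate "sequence above the dashed line" fragments; a different window length or threshold would shift the reported boundaries.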

[0023] FIG. 4 depicts an alignment of the asparaginase domain of human 26443 with a consensus amino acid sequence derived from a hidden Markov model. The upper sequence is the consensus amino acid sequence (SEQ ID NO:7), while the lower amino acid sequence corresponds to amino acids 38 to 345 of SEQ ID NO:2.

[0024] FIG. 5 depicts a cDNA sequence (SEQ ID NO:4) and predicted amino acid sequence (SEQ ID NO:5) of human 46873. The methionine-initiated open reading frame of human 46873 (without the 5′ and 3′ untranslated regions) starts at nucleotide 134 and continues through to nucleotide 1057 of SEQ ID NO:4 (coding sequence also shown in SEQ ID NO:6).
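As a worked check of the open reading frame coordinates stated for human 26443 (nucleotides 91 to 1344 of SEQ ID NO:1) and human 46873 (nucleotides 134 to 1057 of SEQ ID NO:4), the following sketch computes the length of each frame and the number of codons it contains. The function names orf_summary and extract_orf are hypothetical. Whether the stated range includes the termination codon is an assumption of this example, so the residue count shown is conditional on that assumption and the sequence listing governs.

```python
def orf_summary(start, end):
    """Summarize an ORF given 1-based, inclusive nucleotide coordinates."""
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError("coordinates do not describe a whole number of codons")
    codons = length // 3
    return {
        "nucleotides": length,
        "codons": codons,
        # Assumption: the last codon of the stated range is a stop codon.
        "residues_if_stop_included": codons - 1,
    }


def extract_orf(cdna, start, end):
    """Slice the ORF out of a cDNA string using 1-based, inclusive coordinates."""
    return cdna[start - 1:end]


if __name__ == "__main__":
    print("26443:", orf_summary(91, 1344))
    print("46873:", orf_summary(134, 1057))
```

Under the stated assumption, the 26443 frame spans 1254 nucleotides (418 codons) and the 46873 frame spans 924 nucleotides (308 codons), which is consistent with the asparaginase domains being located at amino acids 38 to 345 of SEQ ID NO:2 and 1 to 302 of SEQ ID NO:5, respectively.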

[0025] FIG. 6 depicts a hydropathy plot of human 46873. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. The cysteine residues (Cys) are indicated by short vertical lines just below the hydropathy trace. The numbers corresponding to the amino acid sequence of human 46873 are indicated. Polypeptides of the invention include 46873 fragments that include: all or part of a hydrophobic sequence (a sequence above the dashed line); or all or part of a hydrophilic fragment (e.g., a fragment below the dashed line).

[0026] FIG. 7 depicts a series of plots summarizing an analysis of the primary and secondary protein structure of a human asparaginase. The particular algorithm used for each plot is indicated at the right-hand side of each plot. The following plots are depicted: Garnier-Robson plots providing the predicted location of alpha-, beta-, turn and coil regions (Garnier et al. (1978) J. Mol. Biol. 120:97); Chou-Fasman plots providing the predicted location of alpha-, beta-, turn and coil regions (Chou and Fasman (1978) Adv. In Enzymol. Mol. 47:45-148); Kyte-Doolittle hydrophilicity/hydrophobicity plots (Kyte and Doolittle (1982) J. Mol. Biol. 157:105-132); Eisenberg plots providing the predicted location of alpha- and beta-amphipathic regions (Eisenberg et al. (1982) Nature 299:371-374); a Karplus-Schulz plot providing the predicted location of flexible regions (Karplus and Schulz (1985) Naturwissenschaften 72:212-213); a plot of the antigenic index (Jameson-Wolf) (Jameson and Wolf (1988) CABIOS 4:121-136); and a surface probability plot (Emini algorithm) (Emini et al. (1985) J. Virol. 55:836-839). The numbers corresponding to the amino acid sequence of human 46873 are indicated.

[0027] FIG. 8 depicts an alignment of the asparaginase domain of human 46873 with a consensus amino acid sequence derived from a hidden Markov model. The upper sequence is the consensus amino acid sequence (SEQ ID NO:7), while the lower amino acid sequence corresponds to amino acids 1 to 302 of SEQ ID NO:5.

[0028] FIG. 9 depicts a hydropathy plot of human 61833. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. Numbers corresponding to positions in the amino acid sequence of human 61833 are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence from about amino acid 30 to 50, from about 276 to 293, and from about 323 to 341 of SEQ ID NO:11; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence of from about amino acid 300 to 310 of SEQ ID NO:11.

[0029] FIGS. 10A-10B depict an alignment of the pyridoxal-dependent decarboxylase domain of human 61833 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:13), while the lower amino acid sequence corresponds to amino acids 41 to 401 of SEQ ID NO:11.

[0030] FIG. 11 depicts a hydropathy plot of human 26493. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. The cysteine residues (cys) are indicated by short vertical lines just below the hydropathy trace. The numbers corresponding to the amino acid sequence of human 26493 are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence of from about amino acid residue 85 to 101, and from about 350 to 360 of SEQ ID NO:17; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence from about amino acid residue 360 to 370 of SEQ ID NO:17; or a sequence which includes a Cys, or an N-glycosylation site.
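For illustration, candidate fragments "which include a Cys, or an N-glycosylation site" can be located in a polypeptide sequence with a simple scan. The sketch below reports 1-based positions of cysteine residues and of matches to the commonly used N-glycosylation consensus N-x-[S/T], where x is any residue other than proline. Use of that consensus is an assumption of this example, since the description above does not specify how the sites were predicted, and a match indicates only a potential site. The function names and the toy peptide are hypothetical.

```python
import re

# Consensus sequon N-x-[S/T] with x != P. The lookahead lets overlapping
# candidate sites be reported. This pattern is an assumed prediction rule.
SEQUON = re.compile(r"(?=N[^P][ST])")


def cysteine_positions(seq):
    """Return 1-based positions of every cysteine in the sequence."""
    return [i + 1 for i, aa in enumerate(seq.upper()) if aa == "C"]


def n_glycosylation_sites(seq):
    """Return 1-based positions of the Asn in each candidate N-x-[S/T] sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]


if __name__ == "__main__":
    toy = "MNCSANPTCNNTS"  # hypothetical peptide, not a sequence of the invention
    print("Cys at:", cysteine_positions(toy))
    print("candidate N-glycosylation at:", n_glycosylation_sites(toy))
```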

[0031] FIG. 12 depicts an alignment of the mutT domain of human 26493 with consensus amino acid sequences derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence for a mutT domain (SEQ ID NO:19), while the lower amino acid sequence corresponds to amino acids 122 to 251 of SEQ ID NO:17.

[0032] FIG. 13 depicts a hydropathy plot of human 58224. The SNF2 domain and the C-terminal helicase domain are indicated. The numbers corresponding to the amino acid sequence of human 58224 (SEQ ID NO:23) are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence of 255-265 of SEQ ID NO:23; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence of 150-160 of SEQ ID NO:23; a sequence which includes a Cys, or a glycosylation site.

[0033] FIGS. 14A-14B depict an alignment of the SNF2 N-terminal domain of human 58224 with a consensus amino acid sequence derived from a hidden Markov model. The upper sequence is the consensus amino acid sequence (SEQ ID NO:25), while the lower amino acid sequence corresponds to amino acids 226 to 577 of SEQ ID NO:23.

[0034] FIG. 14C depicts an alignment of the helicase conserved C-terminal domain of human 58224 with a consensus amino acid sequence derived from a hidden Markov model. The upper sequence is the consensus amino acid sequence (SEQ ID NO:26), while the lower amino acid sequence corresponds to amino acids 629 to 712 of SEQ ID NO:23.

[0035] FIG. 15 depicts a hydropathy plot of human 46980. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. Numbers corresponding to positions in the amino acid sequence of human 46980 are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence from about amino acid 1 to 29, from about 187 to 211, and from about 675 to 696 of SEQ ID NO:28; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence of from about amino acid 80 to 100, or 440 to 450 of SEQ ID NO:28. Also indicated are an extracellular domain from about amino acid 43 to 674 of SEQ ID NO:28, a transmembrane domain from about amino acid 675 to 696 of SEQ ID NO:28, an intracellular domain from about amino acid 697 to 816 of SEQ ID NO:28, and a carboxylesterase domain from about amino acid 25 to 590 of SEQ ID NO:28.
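The domain boundaries recited for human 46980 can be held in a small lookup structure, which makes it straightforward to ask which annotated regions cover a given residue or to slice out a domain fragment. The following is a minimal sketch. The boundaries are the approximate ("from about") positions recited above and are treated here as exact only for the purpose of the example, and the names DOMAINS_46980, domains_at, domain_length and fragment are hypothetical.

```python
# Approximate domain boundaries of human 46980 (1-based, inclusive positions
# in SEQ ID NO:28), treated as exact for illustration only.
DOMAINS_46980 = {
    "carboxylesterase": (25, 590),
    "extracellular": (43, 674),
    "transmembrane": (675, 696),
    "intracellular": (697, 816),
}


def domains_at(position, domains=DOMAINS_46980):
    """Return the sorted names of every annotated region covering a residue."""
    return sorted(
        name for name, (first, last) in domains.items()
        if first <= position <= last
    )


def domain_length(name, domains=DOMAINS_46980):
    """Return the number of residues in the named region."""
    first, last = domains[name]
    return last - first + 1


def fragment(seq, name, domains=DOMAINS_46980):
    """Slice the named region out of a full-length polypeptide string."""
    first, last = domains[name]
    return seq[first - 1:last]


if __name__ == "__main__":
    for pos in (30, 100, 680, 700):
        print(pos, domains_at(pos))
    print("transmembrane length:", domain_length("transmembrane"))
```

Note that the carboxylesterase domain overlaps the start of the extracellular domain, so a residue such as position 100 falls within both annotations, whereas the extracellular, transmembrane and intracellular regions are contiguous and non-overlapping.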

[0036] FIGS. 16A-16C depict an alignment of the carboxylesterase domain of human 46980 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:30), while the lower amino acid sequence corresponds to amino acids 25 to 590 of SEQ ID NO:28.

[0037] FIGS. 17A-17B depict a BLAST alignment of a human 46980 polypeptide with a rat neuroligin 3 (SEQ ID NO:31).

[0038] FIG. 18 depicts a hydropathy plot of human 32225. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. Numbers corresponding to positions in the amino acid sequence of human 32225 are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence from about amino acid 71 to 88, from about 135 to 157, and from about 186 to 199 of SEQ ID NO:34; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence of from about amino acid 106 to 123, from about 220 to 236, and from about 299 to 316 of SEQ ID NO:34.

[0039] FIG. 19 depicts an alignment of the hydrolase domain of human 32225 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:36), while the lower amino acid sequence corresponds to amino acids 95 to 338 of SEQ ID NO:34.

[0040] FIGS. 20A-20C depict BLAST alignments of the α/β hydrolase domain of human 32225 with consensus amino acid sequences derived from ProDom families PD349163, PD021903, and PD034252 (see ProDomain Release 2001.1; http://www.toulouse.inra.fr/prodom.html). The BLAST algorithm identifies multiple local alignments between the consensus amino acid sequence and human 32225. (A) The lower sequence is the consensus amino acid sequence for ProDom PD349163 (SEQ ID NO:37), while the upper amino acid sequence corresponds to a first fragment of the α/β hydrolase domain of human 32225, about amino acids 8 to 69 of SEQ ID NO:34. (B) The lower sequence is the consensus amino acid sequence for ProDom PD021903 (SEQ ID NO:38), while the upper amino acid sequence corresponds to a second fragment of the α/β hydrolase domain of human 32225, about amino acids 187 to 277 of SEQ ID NO:34. (C) The lower sequence is the consensus amino acid sequence for ProDom PD034252 (SEQ ID NO:39), while the upper amino acid sequence corresponds to a third fragment of the α/β hydrolase domain of human 32225, about amino acids 280 to 342 of SEQ ID NO:34.

[0041] FIG. 21 depicts a hydropathy plot of human 47508. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. Numbers corresponding to positions in the amino acid sequence of human 47508 are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence from about amino acid 151 to 172, from about 215 to 232, and from about 377 to 393 of SEQ ID NO:42; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence of from about amino acid 19 to 35, from about 245 to 263, and from about 286 to 310 of SEQ ID NO:42.

[0042] FIG. 22 depicts an alignment of the histone deacetylase domain of human 47508 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:44), while the lower amino acid sequence corresponds to amino acids 83 to 392 of SEQ ID NO:42.

[0043] FIGS. 23A-23C depict BLAST alignments of portions of the histone deacetylase domain of human 47508 with representative amino acid sequences derived from ProDomains No. 345193, 001400, and 021448 (ProDomain Release 2000.1; http://www.toulouse.inra.fr/prodom.html). The BLAST algorithm identifies multiple local alignments between the consensus amino acid sequence and human 47508. In FIG. 23A, the lower sequence is the representative amino acid sequence of ProDomain 345193 (SEQ ID NO:45), while the upper amino acid sequence corresponds to an N-terminal portion of the histone deacetylase domain of human 47508, about amino acid residues 71 to 115 of SEQ ID NO:42. In FIG. 23B, the lower sequence is the representative amino acid sequence of ProDomain 001400 (SEQ ID NO:46), while the upper amino acid sequence corresponds to a central portion of the histone deacetylase domain of human 47508, about amino acid residues 120 to 258 of SEQ ID NO:42. In FIG. 23C, the lower sequence is the representative amino acid sequence of ProDomain 021448 (SEQ ID NO:47), while the upper amino acid sequence corresponds to a C-terminal portion of the histone deacetylase domain of human 47508, about amino acid residues 251 to 372 of SEQ ID NO:42.

[0044] FIG. 24 depicts a hydropathy plot of human 56939. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. The numbers corresponding to the amino acid sequence of human 56939 are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence of from about amino acid 78 to 83, from about 100 to 104, and from about 223 to 231 of SEQ ID NO:49; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence of from about amino acid 110 to 115, from about 323 to 330, and from about 339 to 349 of SEQ ID NO:49.

[0045] FIG. 25 depicts an alignment of the acyl-CoA thioesterase domain of human 56939 with a consensus amino acid sequence derived from the ProDomain family PD0006914. The lower sequence is the consensus amino acid sequence (SEQ ID NO:51), while the upper sequence corresponds to amino acids 1 to 415 of SEQ ID NO:49 (SEQ ID NO:52).

[0046] FIG. 26 depicts a hydropathy plot of human 33410. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. The cysteine residues (cys) are indicated by short vertical lines just below the hydropathy trace. The numbers corresponding to the amino acid sequence of human 33410 are indicated. Polypeptides of the invention include 33410 fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence from about amino acid 60 to 72, from about 260 to 277, and from about 780 to 793; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence of from about amino acid 330 to 350, from about 480 to 505, and from about 695 to 720; and/or a sequence which includes a cysteine, or a glycosylation site.

[0047] FIG. 27 depicts an alignment of the carboxylesterase domain of human 33410 with a consensus amino acid sequence derived from a hidden Markov model (HMM). The upper sequence is the consensus amino acid sequence (SEQ ID NO:56), while the lower amino acid sequence corresponds to amino acids 42 to 601 of SEQ ID NO:54.

[0048] FIGS. 28A-28B depict an alignment of the rat neuroligin-2 amino acid sequence and the human 33410 (SEQ ID NO:54) amino acid sequence. The location of the transmembrane domain in the rat neuroligin-2 (SEQ ID NO:57) and 33410 amino acid sequences is indicated as “TM1”.

[0049] FIGS. 29A-29B depict an alignment of the partial human KIAA1366 (GenBank Accession Number AB037787; SEQ ID NO:58) and the human 33410 (SEQ ID NO:54) amino acid sequences.

[0050] FIG. 30 depicts a hydropathy plot of human 33521. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. The cysteine residues (cys) are indicated by short vertical lines just below the hydropathy trace. The numbers corresponding to the amino acid sequence of human 33521 are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence from about amino acid 722 to 730, from about 883 to 891, and from about 966 to 975, of SEQ ID NO:62; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence of from about amino acid 741 to 750, from about 756 to 762, and from about 1363 to 1372, of SEQ ID NO:62; a sequence which includes a Cys, or a glycosylation site.

[0051] FIG. 31A depicts an alignment of the first PH domain of human 33521 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:64), while the lower amino acid sequence corresponds to amino acids 507 to 620 of SEQ ID NO:62.

[0052] FIG. 31B depicts an alignment of the RBD domain of human 33521 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:65), while the lower amino acid sequence corresponds to amino acids 810 to 853 of SEQ ID NO:62.

[0053] FIG. 31C depicts an alignment of the PDZ domain of human 33521 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:66), while the lower amino acid sequence corresponds to amino acids 890 to 975 of SEQ ID NO:62.

[0054] FIG. 31D depicts an alignment of the Rho GEF domain of human 33521 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:67), while the lower amino acid sequence corresponds to amino acids 1103 to 1292 of SEQ ID NO:62.

[0055] FIG. 31E depicts an alignment of the second PH domain of human 33521 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:68), while the lower amino acid sequence corresponds to amino acids 1353 to 1455 of SEQ ID NO:62.

[0056]FIG. 32A depicts an alignment of the first PH domain of human 33521 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from SMART. The upper sequence is the consensus amino acid sequence (SEQ ID NO:69), while the lower amino acid sequence corresponds to amino acids 507 to 622 of SEQ ID NO:62.

[0057]FIG. 32B depicts an alignment of the RBD domain of human 33521 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from SMART. The upper sequence is the consensus amino acid sequence (SEQ ID NO:70), while the lower amino acid sequence corresponds to amino acids 810 to 881 of SEQ ID NO:62.

[0058]FIG. 32C depicts an alignment of the PDZ domain of human 33521 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from SMART. The upper sequence is the consensus amino acid sequence (SEQ ID NO:71), while the lower amino acid sequence corresponds to amino acids 900 to 976 of SEQ ID NO:62.

[0059]FIG. 32D depicts an alignment of the Rho GEF domain of human 33521 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from SMART. The upper sequence is the consensus amino acid sequence (SEQ ID NO:72), while the lower amino acid sequence corresponds to amino acids 1103 to 1292 of SEQ ID NO:62.

[0060]FIG. 32E depicts an alignment of the second PH domain of human 33521 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from SMART. The upper sequence is the consensus amino acid sequence (SEQ ID NO:73), while the lower amino acid sequence corresponds to amino acids 1326 to 1457 of SEQ ID NO:62.

[0061]FIG. 33 depicts a hydropathy plot of human 23479 polypeptide. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. Numbers corresponding to positions in the amino acid sequence of human 23479 are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence from about amino acid 100 to 110, from about amino acid 295 to 310, and from about amino acid 920 to 930, of SEQ ID NO:75; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence from about amino acid 275 to 290, from about amino acid 530 to 550, and from about amino acid 640 to 650, of SEQ ID NO:75.

[0062]FIG. 34A depicts an alignment of the first ubiquitin carboxyl-terminal hydrolase domain (UCH-1) of human 23479 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:83), while the lower amino acid sequence corresponds to amino acids 296-327 of SEQ ID NO:75.

[0063]FIG. 34B depicts an alignment of the second ubiquitin carboxyl-terminal hydrolase domain (UCH-2) of human 23479 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:84), while the lower amino acid sequence corresponds to amino acids 546-640 of SEQ ID NO:75.

[0064]FIG. 35 depicts a hydropathy plot of human 48120 polypeptide. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. Numbers corresponding to positions in the amino acid sequence of human 48120 are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence from about amino acid 1040 to 1055 of SEQ ID NO:78; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence from about amino acid 120 to 155, from about amino acid 680 to 700, and from about amino acid 770 to 800, of SEQ ID NO:78.

[0065]FIG. 36A depicts an alignment of the first ubiquitin carboxyl-terminal hydrolase domain (UCH-1) of human 48120 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:83), while the lower amino acid sequence corresponds to amino acids 162 to 193 of SEQ ID NO:78.

[0066]FIG. 36B depicts an alignment of the second ubiquitin carboxyl-terminal hydrolase domain (UCH-2) of human 48120 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:84), while the lower amino acid sequence corresponds to amino acids 580 to 649 of SEQ ID NO:78.

[0067]FIG. 36C depicts an alignment of the ubiquitin associated (UBA) domain of human 48120 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:85), while the lower amino acid sequence corresponds to amino acids 20 to 61 of SEQ ID NO:78.

[0068]FIG. 36D depicts an alignment of the ubiquitin interaction motif (UIM) domain of human 48120 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:86), while the lower amino acid sequence corresponds to amino acids 96 to 113 of SEQ ID NO:78.

[0069]FIG. 37 depicts a hydropathy plot of human 46689 polypeptide. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. Numbers corresponding to positions in the amino acid sequence of human 46689 are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence from about amino acid 1 to 23, from about 133 to 145, and from about 150 to 168 of SEQ ID NO:81; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence of from about amino acid 34 to 51, from about 333 to 347, and from about 438 to 449 of SEQ ID NO:81.

[0070]FIG. 38 depicts an alignment of the α/β hydrolase domain of human 46689 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:87), while the lower amino acid sequence corresponds to about amino acid residues 186 to 419 of SEQ ID NO:81.

[0071]FIG. 39 depicts a BLAST alignment of the α/β hydrolase domain of human 46689 with a consensus amino acid sequence derived from ProDom family PD007763 (Release 2001.1; http://www.toulouse.inra.fr/prodom.html). The lower sequence is the consensus amino acid sequence (SEQ ID NO:88), while the upper amino acid sequence corresponds to the α/β hydrolase domain of human 46689 along with some flanking sequence, about amino acid residues 97 to 424 of SEQ ID NO:81.

[0072]FIG. 40 depicts a hydropathy plot of human 80091. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. The cysteine residues (Cys) are indicated by short vertical lines just below the hydropathy trace. The numbers corresponding to the amino acid sequence of human 80091 are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequences of about amino acids 188 to 205, about 540 to 550, about 700 to 725, about 817 to 825, and about 980 to 995 of SEQ ID NO:95; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequences of about amino acids 315 to 339, about 530 to 539, about 680 to 695, and about 1185 to 1220 of SEQ ID NO:95; a sequence which includes a Cys, or a glycosylation site.

[0073]FIG. 41A depicts an alignment of the ubiquitin carboxy-terminal hydrolase-1 (UCH-1) domain of human 80091 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:96), while the lower amino acid sequence corresponds to amino acids 447 to about 478 of SEQ ID NO:95.

[0074]FIG. 41B depicts an alignment of the ubiquitin carboxy-terminal hydrolase-2 (UCH-2) domain of human 80091 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:97), while the lower amino acid sequence corresponds to amino acids 1219 to 1279 of SEQ ID NO:95.

[0075]FIGS. 42A-42C depict a cDNA sequence (SEQ ID NO:101) and predicted amino acid sequence (SEQ ID NO:102) of human 46508. The coding sequence (without the 5′ and 3′ untranslated regions), which runs from the initiator methionine of the open reading frame of human 46508 to the termination codon of SEQ ID NO:101, is also indicated (shown as SEQ ID NO:103).

[0076]FIG. 43 depicts a hydropathy plot of human 46508. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. Numbers corresponding to positions in the amino acid sequence of human 46508 are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence from about amino acid 60 to 70, from about 86 to 102, and from about 189 to 195 of SEQ ID NO: 102; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence of from about amino acid 77 to 85, and from about 217 to 224 of SEQ ID NO: 102; a sequence which includes a Cys, or a glycosylation site, of SEQ ID NO: 102.

[0077]FIG. 44 depicts an alignment of the peptidyl-tRNA hydrolase domain of human 46508 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO: 104), while the lower amino acid sequence corresponds to amino acids 44 to 221 of SEQ ID NO:102.

DETAILED DESCRIPTION OF 26443 AND 46873

[0078] The human 26443 sequence (FIG. 1; SEQ ID NO: 1), which is approximately 1888 nucleotides long, including untranslated regions, contains a predicted methionine-initiated coding sequence of about 1254 nucleotides (SEQ ID NO:3, and nucleotides 91-1344 of SEQ ID NO:1). The coding sequence encodes a 418 amino acid protein (SEQ ID NO:2).

[0079] Human 26443 contains a predicted asparaginase domain from about amino acids 38 to 345 of SEQ ID NO:2.

[0080] The 26443 protein also includes the following domains: a predicted N-glycosylation site (PFAM Accession PS00001) located at about amino acid residues 225-228 of SEQ ID NO:2; two predicted glycosaminoglycan attachment sites (PFAM Accession PS00002) located at about amino acid residues 7-10 and 289-292 of SEQ ID NO:2; a predicted cAMP- and cGMP-dependent protein kinase phosphorylation site (PFAM Accession PS00004) located at about amino acid residues 217-220 of SEQ ID NO:2; five predicted Protein Kinase C phosphorylation sites (PS00005) at about amino acids 24-26, 33-35, 186-188, 221-223 and 346-348 of SEQ ID NO:2; six predicted Casein Kinase II phosphorylation sites (PS00006) located at about amino acids 6-9, 24-27, 33-36, 116-119, 221-224 and 381-384 of SEQ ID NO:2; and eight predicted N-myristoylation sites (PS00008) from about amino acids 4-9, 77-82, 100-105, 126-131, 228-233, 242-247, 336-341 and 397-402 of SEQ ID NO:2.

[0081] The human 46873 sequence (FIG. 4; SEQ ID NO:4), which is approximately 1358 nucleotides long, including untranslated regions, contains a predicted methionine-initiated coding sequence of about 924 nucleotides (SEQ ID NO:6, and nucleotides 134-1057 of SEQ ID NO:4). The coding sequence encodes a 308 amino acid protein (SEQ ID NO:5).
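By way of illustration only, the relationship between the coding sequence coordinates and the protein lengths stated above can be checked with simple arithmetic. The following Python sketch assumes 1-based, inclusive nucleotide coordinates; the function name `orf_summary` is hypothetical and is not part of the disclosure.

```python
def orf_summary(start, end):
    """Return (nucleotide length, codon count) for a 1-based inclusive range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of three")
    return length, length // 3


# Coordinates quoted in the text: nucleotides 91-1344 of SEQ ID NO:1 (26443)
# and nucleotides 134-1057 of SEQ ID NO:4 (46873).
coding_ranges = {"26443": (91, 1344), "46873": (134, 1057)}

for gene, (start, end) in coding_ranges.items():
    length, codons = orf_summary(start, end)
    print(f"{gene}: {length} nucleotides, {codons} codons")
```

Under this assumption, the ranges give 1254 and 924 nucleotides, i.e., 418 and 308 codons, which agrees with the stated protein lengths.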

[0082] Human 46873 contains a predicted asparaginase domain from about amino acids 1 to 302 of SEQ ID NO:5.

[0083] The 46873 protein also includes the following domains: one predicted Protein Kinase C phosphorylation site (PS00005) at about amino acids 141-143 of SEQ ID NO:5; five predicted Casein Kinase II phosphorylation sites (PS00006) located at about amino acids 43-46, 71-74, 80-83, 243-246 and 303-306 of SEQ ID NO:5; and eight predicted N-myristoylation sites (PS00008) from about amino acids 26-31, 50-55, 66-71, 90-95, 156-161, 167-172, 187-192 and 214-219 of SEQ ID NO:5.

[0084] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[0085] Plasmids containing the nucleotide sequence encoding human 26443 and 46873 (clones “Fbh26443FL” and “Fbh46873FL,” respectively) were deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Numbers ______ or ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112. Table 1 contains a summary of sequence information for 26443 and 46873.

TABLE 1
Summary of Sequence Information for Asparaginase Polypeptides

GENE    cDNA           ORF            Polypeptide    FIG   ATCC Accession No.
26443   SEQ ID NO: 1   SEQ ID NO: 3   SEQ ID NO: 2   1     —
46873   SEQ ID NO: 4   SEQ ID NO: 6   SEQ ID NO: 5   5     —

[0086] The 26443 and 46873 proteins contain a significant number of structural characteristics in common with members of the asparaginase family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics.

[0087] 26443 and 46873 polypeptides or 26443 and 46873 family members can include an “asparaginase domain” or regions homologous with an “asparaginase domain”.

[0088] As used herein, the term “asparaginase domain” refers to a protein domain having an amino acid sequence of about 50 to 600 amino acids, preferably about 150 to 450 amino acid residues, more preferably about 300 to 310 amino acids. An asparaginase domain typically includes two conserved threonine residues that play a role in the catalytic properties of asparaginases. The first is typically located in the N-terminal extremity of the protein, while the second is located at the end of the first third of the amino acid sequence. Consensus patterns for asparaginases are as follows: [LIVM]-x(2)-T-G-G-T-[IV]-[AGS], SEQ ID NO:8, the second T is an active site residue, and G-x-[LIVM]-x(2)-H-G-T-D-T-[LIVM], SEQ ID NO:9, wherein the first T is an active site residue. Preferably, an “asparaginase domain” includes an amino acid sequence of about 250 to 400 amino acid residues in length and having a bit score for the alignment of the sequence to the asparaginase domain (HMM) of at least 75. More preferably, an asparaginase domain includes at least about 50 to 600 amino acids, even more preferably about 150 to 400 amino acids, or even most preferably, 300-310 amino acids, and has a bit score for the alignment of the sequence to the asparaginase domain (HMM) of at least 75, 100, 200, 300, 400 or greater. Asparaginase domains (HMM) have been assigned PFAM Accession PF00710 and PFAM Accession PF01112 (http://genome.wustl.edu/Pfam/html). An alignment of the asparaginase domain (SEQ ID NO:7, corresponding to amino acids 38 to 345 of SEQ ID NO:2) of human 26443 with a consensus amino acid sequence derived from a hidden Markov model is depicted in FIG. 4. An alignment of the asparaginase domain (SEQ ID NO:7, corresponding to amino acids 1 to 302 of SEQ ID NO:5) of human 46873 with a consensus amino acid sequence derived from a hidden Markov model is depicted in FIG. 8.
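As a non-limiting illustration, the two consensus patterns recited above (SEQ ID NO:8 and SEQ ID NO:9) can be translated into regular expressions and used to scan an amino acid sequence. The Python sketch below supports only the pattern elements that appear in those two patterns (single residues, bracketed residue sets, and `x` with an optional repeat count). The function names and the short test sequence are hypothetical and are not taken from any sequence of the invention.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a regular expression."""
    parts = []
    for element in pattern.split("-"):
        m = re.fullmatch(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)\))?", element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, count = m.groups()
        token = "." if residue == "x" else residue  # x matches any residue
        if count:
            token += "{" + count + "}"
        parts.append(token)
    return "".join(parts)


def find_sites(pattern, sequence):
    """Return 1-based (start, end) positions of all matches, overlaps included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.start() + len(m.group(1)))
            for m in regex.finditer(sequence)]


ASPARAGINASE_1 = "[LIVM]-x(2)-T-G-G-T-[IV]-[AGS]"      # SEQ ID NO:8
ASPARAGINASE_2 = "G-x-[LIVM]-x(2)-H-G-T-D-T-[LIVM]"    # SEQ ID NO:9

# Hypothetical toy sequence used only to exercise the code.
toy_sequence = "MSLAKTGGTIAGQ"
print(prosite_to_regex(ASPARAGINASE_1))
print(find_sites(ASPARAGINASE_1, toy_sequence))
```

A pattern match of this kind indicates only the presence of the short signature; it does not by itself establish an asparaginase domain as defined by the HMM bit score criteria above.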

[0089] In a preferred embodiment, a 26443 or 46873 polypeptide or protein has an “asparaginase domain” or a region which includes at least about 50-600, more preferably about 150-450 or 300-310 amino acid residues, and having at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with an “asparaginase domain,” e.g., the asparaginase domain of human 26443 or 46873 (e.g., residues 38-345 of SEQ ID NO:2 or residues 1-302 of SEQ ID NO:5, respectively).

[0090] To identify the presence of an “asparaginase domain” in a 26443 or 46873 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against a database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference. A search was performed against the HMM database resulting in the identification of an “asparaginase domain” in the amino acid sequence of human 26443 and 46873 at about residues 38-345 of SEQ ID NO:2 (see FIG. 4) and 1-302 of SEQ ID NO:5 (see FIG. 8), respectively.
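The effect of the threshold score described above can be sketched, by way of example only, as a filter applied to a list of candidate domain hits. The Python sketch below does not run any search program. The `DomainHit` record, the `filter_hits` function, the choice of a greater-than-or-equal comparison, and all bit scores shown are hypothetical assumptions made for illustration; only the thresholds of 15 and 8 bits and the residue range 38-345 come from the text.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    model: str        # name of the HMM that produced the hit
    start: int        # first residue of the aligned region (1-based)
    end: int          # last residue of the aligned region
    bit_score: float  # alignment bit score reported by the search


def filter_hits(hits, threshold=15.0):
    """Keep hits whose bit score meets the threshold (default 15 bits)."""
    return [hit for hit in hits if hit.bit_score >= threshold]


# Invented example scores; they are not results from the disclosure.
candidate_hits = [
    DomainHit("asparaginase", 38, 345, 120.0),
    DomainHit("hypothetical_model_a", 10, 60, 9.5),
    DomainHit("hypothetical_model_b", 200, 240, 3.0),
]

default_hits = filter_hits(candidate_hits)                # threshold of 15 bits
relaxed_hits = filter_hits(candidate_hits, threshold=8)   # lowered threshold
print(len(default_hits), len(relaxed_hits))
```

Lowering the threshold admits weaker matches, so any additional hits recovered at 8 bits would ordinarily call for closer inspection of the alignment.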

[0091] As the 26443 or 46873 polypeptides of the invention may modulate 26443- or 46873-mediated activities, they may be useful as, or for, developing novel diagnostic and therapeutic agents for 26443- or 46873-mediated or related disorders, as described below.

[0092] As used herein, a “26443 or 46873 activity”, “biological activity of 26443 or 46873” or “functional activity of 26443 or 46873”, refers to an activity exerted by a 26443 or 46873 protein, polypeptide or nucleic acid molecule on, e.g., a 26443- or 46873-responsive cell or on a 26443 or 46873 substrate, e.g., a protein substrate, as determined in vivo or in vitro. In one embodiment, a 26443 or 46873 activity is a direct activity, such as an association with a 26443 or 46873 target molecule. A “target molecule” or “binding partner” is a molecule with which a 26443 or 46873 protein binds or interacts in nature. In an exemplary embodiment, a “target molecule” is, e.g., an asparagine. A 26443 or 46873 activity can also be an indirect activity, e.g., a cellular signaling activity mediated by interaction of the 26443 or 46873 protein with a 26443 or 46873 ligand. For example, the 26443 or 46873 proteins of the present invention can have one or more of the following activities: (1) catalyzes the hydrolysis of asparagine to aspartic acid and ammonia; (2) regulates cellular amounts of asparagine; (3) regulates the cellular amounts of aspartic acid; (4) regulates cellular amounts of ammonia; and (5) antagonizes or inhibits, e.g., competitively or noncompetitively, any of activities 1-4.

[0093] Based on the above-described sequence similarities, the 26443 or 46873 molecules of the present invention are predicted to have biological activities similar to those of asparaginase family members. Asparaginase enzymes assist in the hydrolysis of asparagine to aspartic acid and ammonia. Thus, the 26443 or 46873 molecules can act as novel diagnostic targets and therapeutic agents for controlling, e.g., the amount of asparagine (and likewise, aspartic acid) in a cell.

[0094] The 26443 or 46873 protein may be involved in disorders characterized by aberrant activity of the cells in which it is expressed. Since asparaginase enzymes are typically found in most cells in bacteria, fungi, plants and mammals, e.g., cells that contain or metabolize asparagine, it is likely that 26443 or 46873 proteins may also be expressed in such cells. Therefore, altered expression and/or activity of a 26443 or 46873 molecule can lead to defects in the metabolism of asparagine and/or aspartic acid.

[0095] The 26443 or 46873 molecules can also act as novel diagnostic targets and therapeutic agents for controlling one or more of cellular proliferative and/or differentiative disorders, disorders associated with bone metabolism, immune disorders, hematopoietic disorders, cardiovascular disorders, liver disorders, viral diseases, pain or metabolic disorders.

[0096] Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast and liver origin.

[0097] As used herein, the terms “cancer”, “hyperproliferative” and “neoplastic” refer to cells having the capacity for autonomous growth, i.e., an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative and neoplastic disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. “Pathologic hyperproliferative” cells occur in disease states characterized by malignant tumor growth. Examples of non-pathologic hyperproliferative cells include proliferation of cells associated with wound repair.

[0098] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[0099] The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

[0100] The term “sarcoma” is art recognized and refers to malignant tumors of mesenchymal derivation.

[0101] The 26443 or 46873 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of hematopoietic neoplastic disorders. As used herein, the term “hematopoietic neoplastic disorders” includes diseases involving hyperplastic/neoplastic cells of hematopoietic origin, e.g., arising from myeloid, lymphoid or erythroid lineages, or precursor cells thereof. Preferably, the diseases arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit Rev. in Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to, acute lymphoblastic leukemia (ALL) which includes B-lineage ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell leukemia (HCL) and Waldenstrom's macroglobulinemia (WM). Additional forms of malignant lymphomas include, but are not limited to, non-Hodgkin lymphoma and variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGL), Hodgkin's disease and Reed-Sternberg disease.

[0102] Asparaginases are generally more effective in treating acute lymphoblastic leukemia and lymphosarcomas than other forms of leukemia or solid tumors, since remissions of these types of cancers are invariably of short duration. Whereas most normal tissues synthesize L-asparagine in amounts sufficient for their metabolic needs, certain neoplastic tissues, primarily acute lymphoblastic leukemia (ALL) and lymphosarcoma cells, require an exogenous source of asparagine (i.e., from nearby host tissues). Administration of L-asparaginase enzymatically catalyzes the hydrolysis of asparagine to aspartic acid and ammonia, which deprives the malignant cells of the asparagine from extracellular fluid and eventually results in cell death. Clinical use of asparaginase from, e.g., Escherichia coli or Erwinia chrysanthemi, oftentimes results in hypersensitive immune responses after multiple administrations. Since the two asparaginase enzymes from E. coli and E. chrysanthemi do not exhibit any cross-reactivity, the two enzymes can be used in a treatment regimen to reduce or avoid the hypersensitivity response.

[0103] Additionally, asparaginases can be administered in combination with other traditional or experimental cancer treatments. Asparaginases can be combined with a treatment modality which inhibits cell proliferation, e.g., cytotoxic agents, e.g., agents with diverse structures and mechanisms of action, including but not limited to, antimicrotubule agents, topoisomerase I inhibitors, topoisomerase II inhibitors, antimetabolites, mitotic inhibitors, alkylating agents, intercalating agents, agents capable of interfering with a signal transduction pathway (e.g., protein kinase C inhibitors, e.g., anti-hormones, e.g., antibodies against growth factor receptors), agents that promote apoptosis and/or necrosis, biological response modifiers (e.g., interferons, interleukins, tumor necrosis factors), and radiation.

[0104] The 26443 or 46873 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO:2 or SEQ ID NO:5, respectively, are collectively referred to as “polypeptides or proteins of the invention” or “26443 or 46873 polypeptides or proteins”. Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “26443 or 46873 nucleic acids”. 26443 or 46873 molecules refer to 26443 or 46873 nucleic acids, polypeptides, and antibodies.

[0105] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA) and RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA generated, e.g., by the use of nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[0106] The term “isolated or purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules that are present in the natural source of the nucleic acid. For example, with regard to genomic DNA, the term “isolated” includes nucleic acid molecules that are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences that naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[0107] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and non-aqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified. Preferably, an isolated nucleic acid molecule of the invention that hybridizes under stringent conditions to the sequence of SEQ ID NO: 1 or 3, corresponds to a naturally-occurring nucleic acid molecule.
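For convenience of reference, the four sets of hybridization and wash conditions recited above can be represented as a lookup table. The following Python sketch is illustrative only; the `Conditions` record, its field names, and the `conditions_for` function are hypothetical. The default returned when no stringency is specified follows the statement above that very high stringency conditions are to be used unless otherwise specified.

```python
from typing import NamedTuple, Optional


class Conditions(NamedTuple):
    hyb_buffer: str      # hybridization solution
    hyb_temp_c: int      # hybridization temperature, degrees C (approximate for 1-3)
    wash_buffer: str     # wash solution
    wash_temp_c: int     # wash temperature, degrees C
    washes: str          # number of washes as stated in the text


HYBRIDIZATION_CONDITIONS = {
    # Low stringency washes are at least at 50 C and may be raised to 55 C.
    "low": Conditions("6x SSC", 45, "0.2x SSC, 0.1% SDS", 50, "two"),
    "medium": Conditions("6x SSC", 45, "0.2x SSC, 0.1% SDS", 60, "one or more"),
    "high": Conditions("6x SSC", 45, "0.2x SSC, 0.1% SDS", 65, "one or more"),
    "very high": Conditions("0.5 M sodium phosphate, 7% SDS", 65,
                            "0.2x SSC, 1% SDS", 65, "one or more"),
}


def conditions_for(stringency: Optional[str] = None) -> Conditions:
    """Look up conditions; default to very high stringency when unspecified."""
    return HYBRIDIZATION_CONDITIONS[stringency or "very high"]


print(conditions_for())
print(conditions_for("medium").wash_temp_c)
```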

[0108] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes a natural protein).

[0109] As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules which include an open reading frame encoding a 26443 or 46873 protein, preferably a mammalian 26443 or 46873 protein, and can further include non-coding regulatory sequences and introns.

[0110] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. In one embodiment, the language “substantially free” means a preparation of 26443 or 46873 protein having less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-26443 or -46873 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-26443 or -46873 chemicals. When the 26443 or 46873 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[0111] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 26443 or 46873 (e.g., the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or Accession Number ______) without abolishing or more preferably, without substantially altering a biological activity, whereas an “essential” amino acid residue results in such a change. For example, amino acid residues that are conserved among the polypeptides of the present invention, e.g., those present in the asparaginase domain, are predicted to be particularly unamenable to alteration.

[0112] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a 26443 or 46873 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a 26443 or 46873 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 26443 or 46873 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or as Accession Number ______, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
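
As an illustrative sketch only, the side chain families listed above can be encoded with one-letter amino acid codes and used to test whether a proposed replacement is conservative in the sense defined in this paragraph. The names SIDE_CHAIN_FAMILIES, shared_families and is_conservative are hypothetical. Because a residue can belong to more than one family (e.g., valine is both nonpolar and beta-branched), the sketch treats a substitution as conservative if the two residues share at least one family.

```python
# Families as listed in the text, written with one-letter amino acid codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),               # lysine, arginine, histidine
    "acidic": set("DE"),               # aspartic acid, glutamic acid
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),       # threonine, valine, isoleucine
    "aromatic": set("YFWH"),
}


def shared_families(a, b):
    """Return the names of all families that contain both residues."""
    a, b = a.upper(), b.upper()
    return [
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if a in members and b in members
    ]


def is_conservative(original, replacement):
    """True if the replacement differs from the original and shares a family."""
    if original.upper() == replacement.upper():
        return False  # not a substitution at all
    return bool(shared_families(original, replacement))
```

For example, is_conservative("K", "R") is true because lysine and arginine are both basic, whereas is_conservative("D", "K") is false because no listed family contains both residues.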

[0113] As used herein, a “biologically active portion” of a 26443 or 46873 protein includes a fragment of a 26443 or 46873 protein that participates in an interaction between a 26443 or 46873 molecule and a non-26443 or -46873 molecule. Biologically active portions of a 26443 or 46873 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 26443 or 46873 protein, e.g., the amino acid sequence shown in SEQ ID NO:2 or SEQ ID NO:5, respectively, which include fewer amino acids than the full-length 26443 or 46873 proteins, and exhibit at least one activity of a 26443 or 46873 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 26443 or 46873 protein, e.g., asparaginase activity. A biologically active portion of a 26443 or 46873 protein can be a polypeptide that is, for example, 50, 100, 200 or more amino acids in length. Biologically active portions of a 26443 or 46873 protein can be used as targets for developing agents that modulate a 26443- or 46873-mediated activity, e.g., asparaginase activity.

[0114] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[0115] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, even more preferably at least 60%, and even more preferably at least 70%, 80%, 90%, or 100% of the length of the reference sequence (e.g., when aligning a second sequence to the 26443 amino acid sequence of SEQ ID NO:2 having 418 amino acid residues, at least 125, preferably at least 167, more preferably at least 209, even more preferably at least 251, and even more preferably at least 293, 334, 376 or 418 amino acid residues are aligned; when aligning a second sequence to the 46873 amino acid sequence of SEQ ID NO:5 having 308 amino acid residues, at least 92, preferably at least 123, more preferably at least 154, even more preferably at least 185, and even more preferably at least 216, 246, 277 or 308 amino acid residues are aligned). The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”). The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
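
The arithmetic in this paragraph can be sketched as follows, for illustration only. The function names min_aligned_lengths and percent_identity are hypothetical. The reference lengths of 418 and 308 residues are taken from the full-length sequences of SEQ ID NO:2 and SEQ ID NO:5 as recited elsewhere herein. The sketch assumes that the two sequences have already been aligned, that gaps are written as "-", and that percent identity is computed over all alignment columns, gap columns included; that choice of denominator is one reasonable reading of "taking into account the number of gaps" and is an assumption, not the only possible convention.

```python
def min_aligned_lengths(reference_length,
                        percents=(30, 40, 50, 60, 70, 80, 90, 100)):
    """Residue counts for each percentage of the reference length.

    Integer arithmetic is used, rounding to the nearest whole residue.
    """
    return [(p * reference_length + 50) // 100 for p in percents]


def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.

    A column counts as identical only if both sequences carry the same
    residue there; a column containing a gap never counts as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * identical / len(aligned_a)
```

Applied to a 418-residue reference, min_aligned_lengths yields the thresholds corresponding to 30% through 100% of the reference length; the same call with 308 gives the thresholds for the shorter sequence.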

[0116] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch (J. Mol. Biol. (48):444-453 (1970)) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used if the practitioner is uncertain about what parameters should be applied to determine if the molecule is within the sequence identity limits of a claim) is using a BLOSUM 62 scoring matrix with a gap open penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
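
For orientation, the dynamic programming idea underlying the Needleman and Wunsch algorithm can be sketched as shown below. This is not the GAP program and does not reproduce its scoring: the sketch uses a simple match/mismatch score and a single linear gap penalty in place of a substitution matrix with separate gap and length weights. The function name needleman_wunsch and the default score values are assumptions chosen only for the example.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of a and b with a linear gap penalty.

    Returns (score, aligned_a, aligned_b), with gaps written as "-".
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

The aligned strings returned by this sketch have equal length, so they can be compared column by column to count identical positions. A production comparison would instead use an established alignment program with the matrix and gap parameters recited above.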

[0117] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Meyers and W. Miller (CABIOS, 4:11-17 (1989)) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[0118] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 26443 or 46873 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 26443 or 46873 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25(17):3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[0119] “Misexpression or aberrant expression”, as used herein, refers to a non-wild type pattern of gene expression, at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over- or under-expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of decreased expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[0120] “Subject”, as used herein, can refer to a mammal, e.g., a human, or to an experimental animal or disease model. The subject can also be a non-human animal, e.g., a horse, cow, goat, or other domestic animal.

[0121] A “purified preparation of cells”, as used herein, refers to, in the case of plant or animal cells, an in vitro preparation of cells and not an entire intact plant or animal. In the case of cultured cells or microbial cells, it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[0122] Various aspects of the invention are described in further detail below.

[0123] Isolated Nucleic Acid Molecules of 26443 and 46873

[0124] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes a 26443 or 46873 polypeptide described herein, e.g., a full-length 26443 or 46873 protein or a fragment thereof, e.g., a biologically active portion of a 26443 or 46873 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 26443 or 46873 mRNA, and fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[0125] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO:1, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or Accession Number ______, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 26443 protein (i.e., “the coding region”, from nucleotides 91-1344 of SEQ ID NO:1), as well as 5′ untranslated sequences (nucleotides 1-90 of SEQ ID NO:1) and 3′ untranslated sequences (nucleotides 1345-1888 of SEQ ID NO:1). Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:1 (e.g., nucleotides 91-1344, corresponding to SEQ ID NO:3) and, e.g., no flanking sequences which normally accompany the subject sequence. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to the mature protein from about amino acid 1 to amino acid 418 of SEQ ID NO:2.

[0126] In another embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO:4, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or Accession Number ______, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 46873 protein (i.e., “the coding region”, from nucleotides 134-1057 of SEQ ID NO:4), as well as 5′ untranslated sequences (nucleotides 1-133 of SEQ ID NO:4) and 3′ untranslated sequences (nucleotides 1058-1358 of SEQ ID NO:4). Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:4 (e.g., nucleotides 134-1057, corresponding to SEQ ID NO:6) and, e.g., no flanking sequences which normally accompany the subject sequence. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to the mature protein from about amino acid 1 to amino acid 308 of SEQ ID NO:5.

[0127] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule which is a complement of the nucleotide sequence shown in SEQ ID NO: 1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or as Accession Number ______, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO: 1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or as Accession Number ______ such that it can hybridize to the nucleotide sequence shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or as Accession Number ______, thereby forming a stable duplex.

[0128] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence which is at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the entire length of the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or as Accession Number ______, or a portion, preferably of the same length, of any of these nucleotide sequences.

[0129] 26443 or 46873 Nucleic Acid Fragments

[0130] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or as Accession Number ______. For example, such a nucleic acid molecule can include a fragment that can be used as a probe or primer or a fragment encoding a portion of a 26443 or 46873 protein, e.g., an immunogenic or biologically active portion of a 26443 or 46873 protein. A fragment can comprise nucleotides 202 to 1125 of SEQ ID NO:1, which encodes an asparaginase domain of human 26443, or nucleotides 134 to 1039 of SEQ ID NO:4, which also encodes an asparaginase domain of human 46873. The nucleotide sequence determined from the cloning of the 26443 or 46873 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 26443 or 46873 family members, or fragments thereof, as well as 26443 or 46873 homologues, or fragments thereof, from other species.
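
The relationship between the amino acid residue ranges and the nucleotide ranges recited herein can be checked with a short calculation, sketched below for illustration. The function name residues_to_nucleotides is hypothetical. The sketch assumes 1-based, inclusive coordinates and an uninterrupted coding region whose first codon begins at the stated start of the coding region (nucleotide 91 of SEQ ID NO:1 and nucleotide 134 of SEQ ID NO:4).

```python
def residues_to_nucleotides(cds_start, first_residue, last_residue):
    """Map an inclusive residue range to an inclusive nucleotide range.

    cds_start is the 1-based position of the first nucleotide of codon 1.
    Each residue occupies three consecutive nucleotides.
    """
    if first_residue < 1 or last_residue < first_residue:
        raise ValueError("residue range must be 1-based and non-empty")
    start = cds_start + 3 * (first_residue - 1)   # first base of first codon
    end = cds_start + 3 * last_residue - 1        # last base of last codon
    return start, end
```

With these assumptions, residues 38 to 345 of the 26443 protein map to nucleotides 202 to 1125 of SEQ ID NO:1, and residues 1 to 302 of the 46873 protein map to nucleotides 134 to 1039 of SEQ ID NO:4, in agreement with the fragment coordinates given above.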

[0131] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment that includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof that are at least 200, preferably 300 amino acids in length. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention.

[0132] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domains, regions, or functional sites described herein. Thus, for example, a nucleic acid fragment can include a sequence corresponding to an asparaginase domain.

[0133] In a preferred embodiment, the fragment is at least 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, or 1000 nucleotides in length.

[0134] 26443 or 46873 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under stringent conditions to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO: 1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or as Accession Number ______, or of a naturally occurring allelic variant or mutant of SEQ ID NO: 1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or as Accession Number ______.

[0135] In a preferred embodiment, the nucleic acid is a probe which is at least 5 or 10, and less than 200, more preferably less than 100, or less than 50, base pairs in length. It should be identical to, or differ by 1, or by fewer than 5 or 10 bases, from a sequence disclosed herein. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[0136] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid which encodes an asparaginase domain (corresponding to residues 38-345 of SEQ ID NO:2 or residues 1-302 of SEQ ID NO:5).

[0137] In another embodiment, a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 26443 or 46873 sequence. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical to, or differ by one base from, a sequence disclosed herein or a naturally occurring variant. For example, primers suitable for amplifying all or a portion of a domain or region described herein are provided, e.g., an asparaginase domain corresponding to residues 38-345 of SEQ ID NO:2 or residues 1-302 of SEQ ID NO:5.

[0138] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[0139] A nucleic acid fragment encoding a “biologically active portion of a 26443 or 46873 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or Accession Number ______, which encodes a polypeptide having a 26443 or 46873 biological activity (e.g., the biological activities of the 26443 or 46873 proteins described herein), expressing the encoded portion of the 26443 or 46873 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 26443 or 46873 protein. For example, a nucleic acid fragment encoding a biologically active portion of 26443 or 46873 includes an asparaginase domain, e.g., amino acid residues 38 to 345 of SEQ ID NO:2 or amino acid residues 1 to 302 of SEQ ID NO:5. A nucleic acid fragment encoding a biologically active portion of a 26443 or 46873 polypeptide may comprise a nucleotide sequence that is 300 or more nucleotides in length (e.g., greater than about 400 nucleotides in length).

[0140] In preferred embodiments, a nucleic acid fragment of 26443 includes a nucleotide sequence which is at least about 300, at least about 353 (e.g., 355, 375, 400), at least about 400 (e.g., 500, 600, 700, 800), at least about 457 (e.g., 460, 500, 600, 700), or more nucleotides in length and hybridizes under stringent hybridization conditions to a nucleic acid molecule of SEQ ID NO: 1, or SEQ ID NO:3, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______.

[0141] In a preferred embodiment, a nucleic acid fragment of 26443 includes a nucleotide sequence comprising nucleotides 183-842, 459-842, 1195-1244, or 1644-1888 of SEQ ID NO: 1, or a portion thereof, wherein each fragment hybridizes under stringent hybridization conditions to a nucleic acid molecule of SEQ ID NO: 1, or SEQ ID NO:3, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______. In another preferred embodiment, a nucleic acid fragment of 26443 includes a nucleotide sequence comprising nucleotides 1-842 of SEQ ID NO: 1, or a portion thereof, wherein each portion is about 183 or longer nucleotides and hybridizes under stringent hybridization conditions to a nucleic acid molecule of SEQ ID NO: 1, or SEQ ID NO:3, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______.

[0142] In a preferred embodiment, a nucleic acid fragment has a nucleotide sequence other than AI793006, AA262517, R89654, or C07777.

[0143] In preferred embodiments, a nucleic acid fragment of 46873 includes a nucleotide sequence which is at least about 300, 400, 500, 560 (e.g., 570, 580, 590, 600), at least about 662 (e.g., 665, 666, 667, 668, 670, 680, 690, 700), or more nucleotides in length and hybridizes under stringent hybridization conditions to a nucleic acid molecule of SEQ ID NO:4, or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______.

[0144] In a preferred embodiment, a nucleic acid fragment of 46873 includes a nucleotide sequence of SEQ ID NO:4 or 6, or a portion thereof; or a portion of the 46873 sequence comprising nucleotides 1-680, 1-686, 1-692 or 1-785 of SEQ ID NO:4, or a portion thereof, wherein each fragment hybridizes under stringent hybridization conditions to a nucleic acid molecule of SEQ ID NO:4, or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______.

[0145] In a preferred embodiment, a nucleic acid fragment has a nucleotide sequence other than AI879995, AI928914, AW131805, or AI978667.

[0146] 26443 or 46873 Nucleic Acid Variants

[0147] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or Accession Number ______. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid that encodes the same 26443 or 46873 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence which differs by at least 1, but fewer than 5, 10, 20, 50, or 100, amino acid residues from that shown in SEQ ID NO:2 or SEQ ID NO:5. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[0148] Nucleic acids of the invention can be chosen for having codons which are preferred, or non-preferred, for a particular expression system. E.g., the nucleic acid can be one in which at least one codon, and preferably at least 10% or 20% of the codons, has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells.
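
A minimal sketch of codon alteration follows, for illustration only. It replaces selected codons with synonymous codons and reports the fraction of codons altered, which can then be compared against thresholds such as 10% or 20%. The mapping ILLUSTRATIVE_PREFERRED and the function name recode are hypothetical; the five codon pairs are synonymous under the standard genetic code but are placeholders, not a codon usage table for any particular expression system.

```python
# Hypothetical mapping from a codon to a synonymous replacement.
# Each pair encodes the same amino acid under the standard genetic code.
ILLUSTRATIVE_PREFERRED = {
    "CTA": "CTG",  # leucine
    "AGA": "CGT",  # arginine
    "ATA": "ATC",  # isoleucine
    "GGA": "GGC",  # glycine
    "CCC": "CCG",  # proline
}


def recode(cds, preferred=ILLUSTRATIVE_PREFERRED):
    """Swap codons per the mapping; return (new_sequence, fraction_altered)."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    recoded = [preferred.get(codon, codon) for codon in codons]
    changed = sum(1 for old, new in zip(codons, recoded) if old != new)
    fraction = changed / len(codons) if codons else 0.0
    return "".join(recoded), fraction
```

Because every replacement in the mapping is synonymous, the recoded sequence encodes the same protein as the input. In practice the mapping would be derived from codon usage data for the chosen host.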

[0149] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism) or can be non-naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared in the encoded product).

[0150] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO: 1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the sequence in ATCC Accession Number ______ or Accession Number ______, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 2%, 5%, 10% or 20% of the subject nucleic acid. If necessary for this analysis the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.
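
The counting rule described above can be sketched as follows, for illustration only. The function name difference_summary is hypothetical. The sketch assumes the two sequences have already been aligned for maximum homology with gaps written as "-", counts each mismatched column and each gap column as one difference (so a looped-out stretch of k bases contributes k differences), and expresses the percentage relative to the ungapped length of the subject nucleic acid. Those conventions are assumptions made for the example.

```python
def difference_summary(aligned_subject, aligned_reference):
    """Return (number_of_differences, percent_of_subject_length).

    Inputs are pre-aligned strings of equal length with "-" marking gaps.
    Mismatches and gap columns each count as one difference.
    """
    if len(aligned_subject) != len(aligned_reference):
        raise ValueError("aligned sequences must have the same length")
    subject_length = len(aligned_subject.replace("-", ""))
    if subject_length == 0:
        raise ValueError("subject sequence is empty")
    differences = sum(
        1 for s, r in zip(aligned_subject, aligned_reference) if s != r
    )
    return differences, 100.0 * differences / subject_length
```

The two returned values can be compared against the count limits (e.g., at least one but less than 10) and the percentage limits (e.g., at least one difference but less than 5%) recited above.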

[0151] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is at least about 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the amino acid sequence shown in SEQ ID NO:2 or SEQ ID NO:5 or a fragment of those sequences. Nucleic acid molecules encoding such polypeptides can readily be identified as being able to hybridize under stringent conditions to the nucleotide sequence shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 26443 or 46873 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 26443 or 46873 gene.

[0152] Preferred variants include those that are correlated with asparaginase activity.

[0153] Allelic variants of 26443 or 46873, e.g., human 26443 or 46873, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 26443 or 46873 protein within a population that maintain the ability to function as an asparaginase. Functional allelic variants will typically contain only conservative substitution of one or more amino acids of SEQ ID NO:2 or SEQ ID NO:5, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally-occurring amino acid sequence variants of the 26443 or 46873, e.g., human 26443 or 46873, protein within a population that do not have the ability to function as an asparaginase. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO:2 or SEQ ID NO:5, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[0154] Moreover, nucleic acid molecules encoding other 26443 or 46873 family members and, thus, which have a nucleotide sequence which differs from the 26443 or 46873 sequences of SEQ ID NO: 1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or Accession Number ______ are intended to be within the scope of the invention.

[0155] Antisense Nucleic Acid Molecules, Ribozymes and Modified 26443 or 46873 Nucleic Acid Molecules

[0156] In another aspect, the invention features an isolated nucleic acid molecule that is antisense to 26443 or 46873. An “antisense” nucleic acid can include a nucleotide sequence that is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 26443 or 46873 coding strand, or to only a portion thereof (e.g., the coding region of human 26443 or 46873 corresponding to SEQ ID NO:3 or SEQ ID NO:6, respectively). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 26443 or 46873 (e.g., the 5′ or 3′ untranslated regions).

[0157] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 26443 or 46873 mRNA, but more preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding region of 26443 or 46873 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 26443 or 46873 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence of interest. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
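
As an illustration of the design described above, an antisense oligonucleotide covering the region around the translation start site can be derived by taking the reverse complement of the corresponding stretch of the sense strand. The sketch below works on a DNA (cDNA) sense-strand sequence; the function names and the coordinate convention are assumptions. In particular, the first base of the start codon is taken as position +1 and the base immediately before it as position −1, with no position 0, so the −10 to +10 region spans 20 nucleotides.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_around_start(sense, start_codon_pos, upstream=10, downstream=10):
    """Antisense oligo against positions -upstream..+downstream.

    sense is the sense-strand DNA sequence; start_codon_pos is the 1-based
    position of the first base of the start codon (position +1).
    """
    first = start_codon_pos - upstream         # 1-based position of -upstream
    last = start_codon_pos + downstream - 1    # 1-based position of +downstream
    if first < 1 or last > len(sense):
        raise ValueError("requested region extends beyond the sequence")
    target = sense[first - 1:last]             # convert to 0-based slice
    return reverse_complement(target)
```

For the 26443 sequence, whose coding region is stated to begin at nucleotide 91 of SEQ ID NO:1, this convention would target nucleotides 81 to 100. The returned oligonucleotide is written 5′ to 3′ and is complementary to the selected target region.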

[0158] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine-substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[0159] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a 26443 or 46873 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

[0160] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[0161] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for a 26443- or 46873-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of a 26443 or 46873 cDNA disclosed herein (i.e., SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6), and a sequence having a known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haseloff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a 26443- or 46873-encoding mRNA. See, e.g., Cech et al., U.S. Pat. No. 4,987,071; and Cech et al., U.S. Pat. No. 5,116,742. Alternatively, 26443 or 46873 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science 261:1411-1418.

[0162] 26443 or 46873 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 26443 or 46873 gene (e.g., the 26443 or 46873 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 26443 or 46873 genes in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6(6):569-84; Helene, C. et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) Bioessays 14(12):807-15. The potential sequences that can be targeted for triple helix formation can be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.
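
By way of illustration only, candidate regions for triple helix targeting could be located by scanning one strand of a regulatory sequence for uninterrupted purine or pyrimidine stretches. The following Python sketch assumes a minimum stretch length of 10 nucleotides and uses a made-up sequence; the function name, the threshold, and the example sequence are hypothetical and are not taken from the present disclosure.

```python
import re

def homopolymer_tracts(seq, min_len=10):
    """Return (start, end, kind) for purine-only or pyrimidine-only runs.

    Coordinates are 1-based and inclusive. min_len is an illustrative
    threshold, not a value stated in the text.
    """
    seq = seq.upper()
    tracts = []
    for kind, pattern in (("purine", "[AG]"), ("pyrimidine", "[CT]")):
        # Build a regex such as "[AG]{10,}" that matches runs of min_len or more.
        for m in re.finditer(f"{pattern}{{{min_len},}}", seq):
            tracts.append((m.start() + 1, m.end(), kind))
    return sorted(tracts)

# Hypothetical promoter fragment, invented for illustration only.
example = "TTGACAGGAGGAAGAGGAGAGTCATGCCTCTTCCTCTCCTTCAG"
for start, end, kind in homopolymer_tracts(example, min_len=10):
    print(start, end, kind, example[start - 1:end])
```

A sequence lacking any such stretch on a single strand is the case for which a switchback molecule, which pairs first with one strand and then with the other, would be considered.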

[0163] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[0164] A 26443 or 46873 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4 (1): 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra; Perry-O'Keefe et al. (1996) Proc. Natl. Acad. Sci. 93: 14670-675.

[0165] PNAs of 26443 or 46873 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 26443 or 46873 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes (e.g., S1 nucleases (Hyrup B. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[0166] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) Bio-Techniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[0167] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region that is complementary to a 26443 or 46873 nucleic acid of the invention. One complementary region has a fluorophore, and the other, a quencher, such that the molecular beacon is useful for quantitating the presence of the 26443 or 46873 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.

[0168] Isolated 26443 or 46873 Polypeptides

[0169] In another aspect, the invention features an isolated 26443 or 46873 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-26443 or -46873 antibodies. 26443 or 46873 protein can be isolated from cells or tissue sources using standard protein purification techniques. 26443 or 46873 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[0170] Polypeptides of the invention include those that arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when the polypeptide is expressed in a native cell.

[0171] In a preferred embodiment, a 26443 or 46873 polypeptide has one or more of the following characteristics:

[0172] (i) it has the ability to catalyze the hydrolysis of asparagine to aspartic acid and ammonia;

[0173] (ii) it has the ability to regulate the cellular levels of asparagine, aspartic acid and ammonia;

[0174] (iii) it has the ability to inhibit or decrease the availability of asparagine in tumors;

[0175] (iv) it has a molecular weight (e.g., deduced molecular weight), amino acid composition or other physical characteristic of a 26443 or a 46873 polypeptide, e.g., a polypeptide having a sequence shown in SEQ ID NO:2 or SEQ ID NO:5;

[0176] (v) it has an overall sequence similarity of at least 60%, preferably at least 70%, more preferably at least 80%, 90%, or 95%, with a polypeptide of SEQ ID NO:2 or SEQ ID NO:5;

[0177] (vi) it has an asparaginase domain which is preferably about 60%, 70%, 80%, 90% or 95% homologous to amino acid residues 38-345 of SEQ ID NO:2 or residues 1-302 of SEQ ID NO:5; or

[0178] (vii) it has at least 70%, preferably 80%, more preferably 90%, and most preferably 100% of the cysteines found in the amino acid sequence of the native protein.

[0179] In a preferred embodiment, the 26443 or 46873 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:2 or SEQ ID NO:5, respectively. In one embodiment, the protein differs by at least one, but by less than 15, 10 or 5 amino acid residues. In another, it differs from the corresponding sequence in SEQ ID NO:2 or SEQ ID NO:5 by at least one residue but less than 20%, 15%, 10% or 5% of the residues. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non-essential residue or a conservative substitution. In a preferred embodiment the differences are not in the asparaginase domain. In another preferred embodiment one or more differences are in the asparaginase domain.
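
The counting convention described above, in which both mismatches and “looped” out positions are treated as differences, can be sketched in Python as follows. The sketch assumes that the two sequences have already been aligned to equal length, that a hyphen marks a looped-out position, and that the fraction is taken over the residues of the reference sequence; these conventions, the function name, and the short example sequences are illustrative assumptions rather than part of the disclosure.

```python
def difference_summary(aligned_ref, aligned_query, gap="-"):
    """Return (number of differences, fraction of reference residues).

    Both inputs must already be aligned to the same length. A mismatch and
    a looped-out (gap) position each count as one difference.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    diffs = 0
    for a, b in zip(aligned_ref, aligned_query):
        if a == gap and b == gap:
            continue  # column holds no residue in either sequence
        if a != b:
            diffs += 1
    ref_len = sum(1 for a in aligned_ref if a != gap)
    return diffs, diffs / ref_len

# Made-up ten-residue sequences, for illustration only.
diffs, fraction = difference_summary("MKTAYIAKQR", "MKSAY-AKQR")
print(diffs, fraction)
print("at least one but fewer than 5 residues:", 1 <= diffs < 5)
```

The returned count can be compared against the residue limits (15, 10 or 5) and the returned fraction against the percentage limits (20%, 15%, 10% or 5%) recited above.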

[0180] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 26443 or 46873 proteins differ in amino acid sequence from SEQ ID NO:2 or SEQ ID NO:5, respectively, yet retain biological activity.

[0181] In one embodiment, the protein includes an amino acid sequence that is at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more homologous to SEQ ID NO:2 or SEQ ID NO:5.

[0182] A 26443 or 46873 protein or fragment is provided which varies from the sequence of SEQ ID NO:2 or SEQ ID NO:5, respectively, in non-essential regions (e.g., transmembrane domains) by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment, but which does not differ from SEQ ID NO:2 or SEQ ID NO:5 in catalytic regions (e.g., the asparaginase domain). (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments, the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.

[0183] In one embodiment, a biologically active portion of a 26443 or 46873 protein includes an asparaginase domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 26443 or 46873 protein.

[0184] Particularly preferred 26443 or 46873 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO:2 or 5. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:2 or 5 are termed substantially identical.
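
As a minimal sketch of how such a comparison might be computed, the Python code below reports, for two pre-aligned amino acid sequences, the percentage of aligned residues that are identical and the percentage that are either identical or conservative substitutions. The side-chain families used to decide what counts as conservative, the choice to ignore gapped columns, and the example sequences are assumptions made for illustration; this section does not define them.

```python
# Assumed side-chain families for "conservative" substitutions. This grouping
# is an illustrative choice and is not defined in this section.
FAMILIES = [
    set("KRH"),       # basic
    set("DE"),        # acidic
    set("GNQSTYC"),   # uncharged polar
    set("AVLIPFMW"),  # nonpolar
]

def is_conservative(a, b):
    """True if residues a and b fall within the same assumed family."""
    return any(a in fam and b in fam for fam in FAMILIES)

def identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Return (percent identical, percent identical-or-conservative).

    Percentages are taken over aligned columns in which neither sequence
    has a gap (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        raise ValueError("no aligned residues to compare")
    identical = sum(a == b for a, b in columns)
    similar = sum(a == b or is_conservative(a, b) for a, b in columns)
    return 100.0 * identical / len(columns), 100.0 * similar / len(columns)

# Made-up sequences, for illustration only.
pct_id, pct_sim = identity_and_similarity("MKTAYIAKQR", "MRTAYLDKQR")
print(pct_id, pct_sim, "meets 60% identity:", pct_id >= 60.0)
```

In practice the alignment itself would be produced by an alignment program, and the resulting percent identity could then be compared against the thresholds listed above.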

[0185] In the context of nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO: 1, 3, 4 or 6 are termed substantially identical.

[0186] 26443 or 46873 Chimeric or Fusion Proteins

[0187] In another aspect, the invention provides 26443 or 46873 chimeric or fusion proteins. As used herein, a 26443 or 46873 “chimeric protein” or “fusion protein” includes a 26443 or 46873 polypeptide linked to a non-26443 or -46873 polypeptide. A “non-26443 or -46873 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the 26443 or 46873 protein, e.g., a protein which is different from the 26443 or 46873 protein and which is derived from the same or a different organism. The 26443 or 46873 polypeptide of the fusion protein can correspond to all or a portion, e.g., a fragment described herein of a 26443 or 46873 amino acid sequence. In a preferred embodiment, a 26443 or 46873 fusion protein includes at least one (or two) biologically active portion of a 26443 or 46873 protein. The non-26443 or -46873 polypeptide can be fused to the N-terminus or C-terminus of the 26443 or 46873 polypeptide.

[0188] The fusion protein can include a moiety that has a high affinity for a ligand. For example, the fusion protein can be a GST-26443 or -46873 fusion protein in which the 26443 or 46873 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 26443 or 46873. Alternatively, the fusion protein can be a 26443 or 46873 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 26443 or 46873 can be increased through use of a heterologous signal sequence.

[0189] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[0190] The 26443 or 46873 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 26443 or 46873 fusion proteins can be used to affect the bioavailability of a 26443 or 46873 substrate. 26443 or 46873 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example, (i) aberrant modification or mutation of a gene encoding a 26443 or 46873 protein; (ii) mis-regulation of the 26443 or 46873 gene; and (iii) aberrant post-translational modification of a 26443 or 46873 protein.

[0191] Moreover, the 26443- or 46873-fusion proteins of the invention can be used as immunogens to produce anti-26443 or -46873 antibodies in a subject, to purify 26443 or 46873 ligands and in screening assays to identify molecules that inhibit the interaction of 26443 or 46873 with a 26443 or 46873 substrate.

[0192] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 26443- or 46873-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 26443 or 46873 protein.

[0193] Variants of 26443 or 46873 Proteins

[0194] In another aspect, the invention also features a variant of a 26443 or 46873 polypeptide, e.g., which functions as an agonist (mimetics) or as an antagonist. Variants of the 26443 or 46873 proteins can be generated by mutagenesis, e.g., discrete point mutation, the insertion or deletion of sequences or the truncation of a 26443 or 46873 protein. An agonist of the 26443 or 46873 proteins can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of a 26443 or 46873 protein. An antagonist of a 26443 or 46873 protein can inhibit one or more of the activities of the naturally occurring form of the 26443 or 46873 protein by, for example, competitively modulating a 26443- or 46873-mediated activity of a 26443 or 46873 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 26443 or 46873 protein.

[0195] Variants of a 26443 or 46873 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a 26443 or 46873 protein for agonist or antagonist activity.

[0196] Libraries of fragments, e.g., N-terminal, C-terminal, or internal fragments, of a 26443 or 46873 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of a 26443 or 46873 protein.

[0197] Variants in which a cysteine residue is added or deleted or in which a residue that is glycosylated is added or deleted are particularly preferred.

[0198] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property are known in the art. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify 26443 or 46873 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6(3):327-331).

[0199] Cell based assays can be exploited to analyze a variegated 26443 or 46873 library. For example, a library of expression vectors can be transfected into a cell line, e.g., a cell line that ordinarily responds to 26443 or 46873 in a substrate-dependent manner. The transfected cells are then contacted with 26443 or 46873 and the effect of the expression of the mutant on the 26443 or 46873 substrate can be detected, e.g., by measuring the amount of asparagine and/or aspartic acid and ammonia. Plasmid DNA can then be recovered from the cells that score for inhibition, or alternatively, potentiation of the mutant by the 26443 or 46873 substrate, and the individual clones further characterized.

[0200] In another aspect, the invention features a method of making a 26443 or 46873 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 26443 or 46873 polypeptide. The method includes: altering the sequence of a 26443 or 46873 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[0201] In another aspect, the invention features a method of making a fragment or analog of a 26443 or 46873 polypeptide having a biological activity of a naturally occurring 26443 or 46873 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of a 26443 or 46873 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[0202] Anti-26443 or -46873 Antibodies

[0203] In another aspect, the invention provides an anti-26443 or -46873 antibody. The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. Examples of immunologically active portions of immunoglobulin molecules include F(ab) and F(ab′)₂ fragments which can be generated by treating the antibody with an enzyme such as pepsin.

[0204] The antibody can be a polyclonal, monoclonal, recombinant, e.g., a chimeric or humanized, fully human, non-human, e.g., murine, or single chain antibody. In a preferred embodiment it has effector function and can fix complement. The antibody can be coupled to a toxin or imaging agent.

[0205] In a preferred embodiment, the antibody fails to bind an Fc receptor, e.g., it is an isotype which does not bind to an Fc receptor, or has been modified, e.g., by deletion or other mutation, such that it does not have a functional Fc receptor binding region.

[0206] A full-length 26443 or 46873 protein, or an antigenic peptide fragment of 26443 or 46873 can be used as an immunogen or can be used to identify anti-26443 or -46873 antibodies made with other immunogens, e.g., cells, membrane preparations, and the like. The antigenic peptide of 26443 or 46873 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO:2 or SEQ ID NO:5, respectively, and encompasses an epitope of 26443 or 46873. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.

[0207] Fragments of 26443 or 46873 which include, for example, residues 22-40, 48-65, 193-202, 203-230 or 345-361 of SEQ ID NO:2 or residues 12-26, 55-65 or 139-170 of SEQ ID NO:5, respectively, can be used to make, e.g., antibodies against hydrophilic regions of the 26443 or 46873 protein or used as immunogens or to characterize the specificity of an antibody. Similarly, a fragment of 26443 or 46873 which includes, for example, residues 40-50, 75-90, 230-241 or 337-349 of SEQ ID NO:2 or residues 43-55, 90-105 or 138-170 of SEQ ID NO:5, respectively, can be used to make an antibody against a hydrophobic region of the 26443 or 46873 protein; a fragment of 26443 or 46873 which includes residues 1-100, 50-150, 100-200, 150-250, 200-300, 250-350, 300-400 or 350-418 of SEQ ID NO:2 or residues 1-100, 50-150, 100-200, 150-250, 200-300 or 250-308 of SEQ ID NO:5, respectively, can be used to make an antibody against a non-transmembrane (i.e., matrix, cytosolic or lumen) region of the 26443 or 46873 protein; and a fragment of 26443 or 46873 which includes residues 38-345 of SEQ ID NO:2 or residues 1-302 of SEQ ID NO:5, respectively, can be used to make an antibody against the asparaginase domain of the 26443 or 46873 protein.
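
The residue ranges recited above are 1-based and inclusive. A short Python sketch of extracting such fragments from a protein sequence is given below, with a check that each fragment meets the minimum length of 8 residues stated for an antigenic peptide. The function name is hypothetical, and the 418-residue string is a placeholder, not the sequence of SEQ ID NO:2, which is not reproduced in this section.

```python
def fragment(protein, start, end, min_len=8):
    """Return residues start..end of protein (1-based, inclusive).

    Raises ValueError if the range falls outside the sequence or the
    fragment is shorter than min_len residues.
    """
    if not (1 <= start <= end <= len(protein)):
        raise ValueError(f"range {start}-{end} is outside the sequence")
    peptide = protein[start - 1:end]
    if len(peptide) < min_len:
        raise ValueError(f"fragment {start}-{end} is shorter than {min_len} residues")
    return peptide

# Placeholder 418-residue string standing in for SEQ ID NO:2; it is NOT the
# real sequence and is used only so that the example runs.
placeholder = ("ACDEFGHIKLMNPQRSTVWY" * 21)[:418]

# Hydrophilic-region ranges of SEQ ID NO:2 as recited in the text.
hydrophilic_ranges = [(22, 40), (48, 65), (193, 202), (203, 230), (345, 361)]
for start, end in hydrophilic_ranges:
    print(f"{start}-{end}", fragment(placeholder, start, end))
```

Substituting the actual polypeptide sequence for the placeholder would yield the corresponding candidate immunogen peptides.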

[0208] Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[0209] Preferred epitopes encompassed by the antigenic peptide are regions of 26443 or 46873 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 26443 or 46873 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 26443 or 46873 protein and are thus likely to constitute surface residues useful for targeting antibody production. For example, residues 25-45, 55-65, 190-205 and 212-225 of the 26443 protein and residues 15-25 and 140-160 of the 46873 protein have a high probability of being localized on the surface of the respective proteins based on an Emini surface probability plot.

[0210] In a preferred embodiment, the antibody can bind to a 26443 or 46873 protein intracellularly. In another embodiment, the antibody binds to a 26443 or 46873 protein extracellularly.

[0211] In a preferred embodiment the antibody binds an epitope on any domain or region on 26443 or 46873 proteins described herein.

[0212] In preferred embodiments an antibody can be made by immunizing with purified 26443 or 46873 antigen, or a fragment thereof, e.g., a fragment described herein, membrane associated antigen, tissue, e.g., crude tissue preparations, whole cells, preferably living cells, lysed cells, or cell fractions, e.g., membrane fractions, cytoplasmic fractions.

[0213] Antibodies that bind only native 26443 or 46873 protein, only denatured or otherwise non-native 26443 or 46873 protein, or that bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be determined by identifying antibodies that bind to native but not denatured 26443 or 46873 proteins.

[0214] Chimeric, humanized, but most preferably, completely human antibodies are desirable for applications which include repeated administration, e.g., therapeutic treatment (and some diagnostic applications) of human patients.

[0215] The anti-26443 or -46873 antibody can be a single-chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher, D., et al. Ann N Y Acad Sci 1999 Jun. 30;880:263-80; and Reiter, Y. Clin Cancer Res 1996 February;2(2):245-52). The single-chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 26443 or 46873 protein.

[0216] An anti-26443 or -46873 antibody (e.g., monoclonal antibody) can be used to isolate 26443 or 46873 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-26443 or -46873 antibody can be used to detect 26443 or 46873 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-26443 or -46873 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labeling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material is luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

[0217] The invention also includes a nucleic acid that encodes an anti-26443 or -46873 antibody, e.g., an anti-26443 or -46873 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g., CHO or lymphatic cells.

[0218] The invention also includes cell lines, e.g., hybridomas, which make an anti-26443 or -46873 antibody, e.g., an antibody described herein, and methods of using said cells to make an anti-26443 or -46873 antibody.

[0219] 26443 and 46873 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[0220] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[0221] A vector can include a 26443 or 46873 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those that direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 26443 or 46873 proteins, mutant forms of 26443 or 46873 proteins, fusion proteins, and the like).

[0222] The recombinant expression vectors of the invention can be designed for expression of 26443 or 46873 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990). Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[0223] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[0224] Purified fusion proteins can be used in 26443 or 46873 activity assays, (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 26443 or 46873 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells that are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six (6) weeks).

[0225] One strategy used to maximize recombinant protein expression in E. coli is to express the protein in a host strain with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990) 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
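
The codon alteration strategy described above can be illustrated with a short Python sketch that replaces each codon of a coding sequence with a synonymous codon from a preference table and then confirms that the encoded amino acid sequence is unchanged. The preference table below is an assumed, illustrative choice of one frequently used E. coli codon per amino acid and is not taken from Wada et al.; a codon usage table for the actual host strain should be consulted. The function names and the input sequence are likewise hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# Assumed preferred codon per amino acid (illustrative only).
PREFERRED = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}

def translate(cds):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))

def recode(cds):
    """Swap each codon for the preferred synonymous codon; stops are kept."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of three")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    recoded = "".join(PREFERRED.get(CODON_TABLE[c], c) for c in codons)
    # Safety check: the recoded sequence must encode the same protein.
    assert translate(recoded) == translate(cds)
    return recoded

# Made-up open reading frame, for illustration only.
print(recode("ATGAGACTAGGGTCCTAA"))
```

The recoded sequence could then be prepared by standard DNA synthesis techniques, as noted above.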

[0226] The 26443 or 46873 expression vector can be a yeast expression vector, a vector for expression in insect cells, e.g., a baculovirus expression vector, or a vector suitable for expression in mammalian cells.

[0227] When used in mammalian cells, the expression vector's control functions are often provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[0228] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3:537-546).

[0229] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus. For a discussion of the regulation of gene expression using antisense genes, see Weintraub, H. et al., Antisense RNA as a molecular tool for genetic analysis, Reviews—Trends in Genetics, Vol. 1(1) 1986.

[0230] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., a 26443 or 46873 nucleic acid molecule within a recombinant expression vector or a 26443 or 46873 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell, but also to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[0231] A host cell can be any prokaryotic or eukaryotic cell. For example, a 26443 or 46873 protein can be expressed in bacterial cells such as E. coli, insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells). Other suitable host cells are known to those skilled in the art.

[0232] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[0233] A host cell of the invention can be used to produce (i.e., express) a 26443 or 46873 protein. Accordingly, the invention further provides methods for producing a 26443 or 46873 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a 26443 or 46873 protein has been introduced) in a suitable medium such that a 26443 or 46873 protein is produced. In another embodiment, the method further includes isolating a 26443 or 46873 protein from the medium or the host cell.

[0234] In another aspect, the invention features a cell or purified preparation of cells that include a 26443 or 46873 transgene, or which otherwise misexpress 26443 or 46873. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include a 26443 or 46873 transgene, e.g., a heterologous form of a 26443 or 46873, e.g., a gene derived from humans (in the case of a non-human cell). The 26443 or 46873 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene that misexpresses an endogenous 26443 or 46873, e.g., a gene for which expression is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders that are related to mutated or misexpressed 26443 or 46873 alleles or for use in drug screening.

[0235] In another aspect, the invention features a human cell, e.g., a tumor cell, transformed with a nucleic acid that encodes a subject 26443 or 46873 polypeptide.

[0236] Also provided are cells, e.g., human cells, e.g., human hematopoietic or fibroblast cells, in which an endogenous 26443 or 46873 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 26443 or 46873 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 26443 or 46873 gene. For example, an endogenous 26443 or 46873 gene, e.g., a gene that is “transcriptionally silent”, e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques, such as targeted homologous recombination, can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published May 16, 1991.

[0237] 26443 and 46873 Transgenic Animals

[0238] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of a 26443 or 46873 protein and for identifying and/or evaluating modulators of 26443 or 46873 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal include a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 26443 or 46873 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[0239] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of a 26443 or 46873 protein to particular cells. A transgenic founder animal can be identified based upon the presence of a 26443 or 46873 transgene in its genome and/or expression of 26443 or 46873 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a 26443 or 46873 protein can further be bred to other transgenic animals carrying other transgenes.

[0240] 26443 or 46873 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments the nucleic acid is placed under the control of a tissue specific promoter, e.g., a milk or egg specific promoter, and the protein or polypeptide is recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[0241] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[0242] Uses of 26443 and 46873

[0243] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: a) screening assays; b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); and c) methods of treatment (e.g., therapeutic and prophylactic).

[0244] The isolated nucleic acid molecules of the invention can be used, for example, to express a 26443 or 46873 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect a 26443 or 46873 mRNA (e.g., in a biological sample) or a genetic alteration in a 26443 or 46873 gene, and to modulate 26443 or 46873 activity, as described further below. The 26443 or 46873 proteins can be used to treat disorders characterized by insufficient or excessive production of a 26443 or 46873 substrate or production of 26443 or 46873 inhibitors. In addition, the 26443 or 46873 proteins can be used to screen for naturally occurring 26443 or 46873 substrates, to screen for drugs or compounds which modulate 26443 or 46873 activity, as well as to treat disorders characterized by insufficient or excessive production of 26443 or 46873 protein or production of 26443 or 46873 protein forms which have decreased, aberrant or unwanted activity compared to 26443 or 46873 wild type protein (e.g., altered cellular levels of asparagine and/or aspartic acid and ammonia). Moreover, the anti-26443 or -46873 antibodies of the invention can be used to detect and isolate 26443 or 46873 proteins, regulate the bioavailability of 26443 or 46873 proteins, and modulate 26443 or 46873 activity.

[0245] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 26443 or 46873 polypeptide is provided. The method includes: contacting the compound with the subject 26443 or 46873 polypeptide; and evaluating the ability of the compound to interact with, e.g., to bind or form a complex with, the subject 26443 or 46873 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with a subject 26443 or 46873 polypeptide. It can also be used to find natural or synthetic inhibitors of a subject 26443 or 46873 polypeptide. Screening methods are discussed in more detail below.

[0246] 26443 and 46873 Screening Assays

[0247] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to 26443 or 46873 proteins, have a stimulatory or inhibitory effect on, for example, 26443 or 46873 expression or 26443 or 46873 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 26443 or 46873 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 26443 or 46873 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[0248] In one embodiment, the invention provides assays for screening candidate or test compounds that are substrates of a 26443 or 46873 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate the activity of a 26443 or 46873 protein or polypeptide or a biologically active portion thereof.

[0249] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries [libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive] (see, e.g., Zuckermann, R. N. et al. J. Med. Chem. 1994, 37: 2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam, K. S. (1997) Anticancer Drug Des. 12:145).

[0250] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. U.S.A. 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and in Gallop et al. (1994) J. Med. Chem. 37:1233.

[0251] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. '409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390); (Devlin (1990) Science 249:404-406); (Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382); (Felici (1991) J. Mol. Biol. 222:301-310); (Ladner supra.).

[0252] In one embodiment, an assay is a cell-based assay in which a cell that expresses a 26443 or 46873 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 26443 or 46873 activity is determined. Determining the ability of the test compound to modulate 26443 or 46873 activity can be accomplished by monitoring, for example, cellular asparagine levels. The cell, for example, can be of mammalian origin, e.g., a tumor cell.

[0253] The ability of the test compound to modulate 26443 or 46873 binding to a compound, e.g., a 26443 or 46873 substrate, or to bind to 26443 or 46873 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 26443 or 46873 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 26443 or 46873 could be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 26443 or 46873 binding to a 26443 or 46873 substrate in a complex. For example, compounds (e.g., 26443 or 46873 substrates) can be labeled with ¹²⁵I, ³⁵S, ¹⁴C, or ³H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[0254] The ability of a compound (e.g., a 26443 or 46873 substrate) to interact with 26443 or 46873 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 26443 or 46873 without the labeling of either the compound or the 26443 or 46873 (McConnell, H. M. et al. (1992) Science 257:1906-1912). As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 26443 or 46873.

[0255] In yet another embodiment, a cell-free assay is provided in which a 26443 or 46873 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 26443 or 46873 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 26443 or 46873 proteins to be used in assays of the present invention include fragments that participate in interactions with non-26443 or -46873 molecules, e.g., fragments with high surface probability scores.

[0256] Soluble and/or membrane-bound forms of isolated proteins (e.g., 26443 or 46873 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, Isotridecylpoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[0257] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[0258] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternately, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
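
The dependence of transfer efficiency on donor-acceptor distance can be made concrete with the standard Förster relation, E = 1 / (1 + (r/R0)^6), where r is the separation and R0 is the distance at which transfer is 50% efficient. This relation is general background and is not stated in this document; the Python sketch below, including the function name and the example R0 value, is an illustrative assumption only.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Return the Förster transfer efficiency for a donor-acceptor pair.

    distance_nm        -- donor-acceptor separation r
    forster_radius_nm  -- R0, the separation giving 50% efficiency
                          (depends on the dye pair; supplied by the caller)
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Forster radius must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


if __name__ == "__main__":
    # Hypothetical R0 of 5 nm, chosen only to show the steep distance dependence.
    for r in (2.5, 5.0, 10.0):
        print(f"r = {r:4.1f} nm  ->  E = {fret_efficiency(r, 5.0):.3f}")
```

The sixth-power term is why acceptor emission rises sharply when the labeled molecules bind and falls to near background when they are separated by more than about twice R0.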

[0259] In another embodiment, determining the ability of the 26443 or 46873 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal that can be used as an indication of real-time reactions between biological molecules.

[0260] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound, (which is not anchored), can be labeled, either directly or indirectly, with detectable labels discussed herein.

[0261] It may be desirable to immobilize either 26443 or 46873, an anti-26443 or -46873 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a 26443 or 46873 protein, or interaction of a 26443 or 46873 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase-26443 or -46873 fusion proteins or glutathione-S-transferase-target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 26443 or 46873 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and the complex is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 26443 or 46873 binding or activity determined using standard techniques.

[0262] Other techniques for immobilizing either a 26443 or 46873 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 26443 or 46873 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical).

[0263] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[0264] In one embodiment, this assay is performed utilizing antibodies reactive with 26443 or 46873 protein or target molecules but which do not interfere with binding of the 26443 or 46873 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 26443 or 46873 protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 26443 or 46873 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 26443 or 46873 protein or target molecule.

[0265] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., Trends Biochem Sci 1993 August;18(8):284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., J Mol Recognit 1998 Winter; 11(1-6):141-8; Hage, D. S., and Tweed, S. A. J Chromatogr B Biomed Sci Appl 1997 Oct. 10;699(1-2):499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[0266] In a preferred embodiment, the assay includes contacting the 26443 or 46873 protein or biologically active portion thereof with a known compound which binds 26443 or 46873 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a 26443 or 46873 protein, wherein determining the ability of the test compound to interact with a 26443 or 46873 protein includes determining the ability of the test compound to preferentially bind to 26443 or 46873 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[0267] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 26443 or 46873 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of a 26443 or 46873 protein through modulation of the activity of a downstream effector of a 26443 or 46873 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[0268] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), a reaction mixture containing the target gene product and the binding partner is prepared, under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.
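
The comparison of complex formation in control and test reaction mixtures can be expressed as a percent inhibition. The Python sketch below is a minimal illustration under assumed conventions: the function name, the optional background subtraction, and the example signal values are hypothetical and are not part of the described assay.

```python
def percent_inhibition(control_signal: float,
                       test_signal: float,
                       background: float = 0.0) -> float:
    """Percent reduction in complex signal caused by a test compound.

    control_signal -- complex signal without the test compound (or with placebo)
    test_signal    -- complex signal with the test compound present
    background     -- assumed signal from a no-complex blank, subtracted from both
    """
    control = control_signal - background
    test = test_signal - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (1.0 - test / control)


if __name__ == "__main__":
    # Hypothetical readings for normal and mutant target gene product.
    normal = percent_inhibition(control_signal=1000.0, test_signal=950.0)
    mutant = percent_inhibition(control_signal=1000.0, test_signal=250.0)
    print(f"normal: {normal:.1f}% inhibition, mutant: {mutant:.1f}% inhibition")
```

Running the same calculation for normal and mutant target gene product, as in the usage example, gives a simple numerical way to flag compounds that disrupt the mutant interaction while leaving the normal one largely intact.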

[0269] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[0270] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled, either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[0271] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[0272] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[0273] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in that either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[0274] In yet another aspect, the 26443 or 46873 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 26443 or 46873 (“26443- or 46873-binding proteins” or “26443- or 46873-bp”) and are involved in 26443 or 46873 activity. Such 26443- or 46873-bps can be activators or inhibitors of signals by the 26443 or 46873 proteins or 26443 or 46873 targets as, for example, downstream elements of a 26443- or 46873-mediated signaling pathway.

[0275] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a 26443 or 46873 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively, the 26443 or 46873 protein can be fused to the activation domain.) If the “bait” and the “prey” proteins are able to interact, in vivo, forming a 26443- or 46873-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., LacZ) that is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene that encodes the protein that interacts with the 26443 or 46873 protein.

[0276] In another embodiment, modulators of 26443 or 46873 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 26443 or 46873 mRNA or protein evaluated relative to the level of expression of 26443 or 46873 mRNA or protein in the absence of the candidate compound. When expression of 26443 or 46873 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 26443 or 46873 mRNA or protein expression. Alternatively, when expression of 26443 or 46873 mRNA or protein is less (e.g., statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 26443 or 46873 mRNA or protein expression. The level of 26443 or 46873 mRNA or protein expression can be determined by methods described herein for detecting 26443 or 46873 mRNA or protein.
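The comparison described above can be sketched in code. The following is a minimal illustration only, assuming replicate expression measurements in arbitrary units. The function name `classify_modulator` and the fold-change cutoff `min_fold_change` are hypothetical and are not part of the assay as described; an actual assay would apply an appropriate test of statistical significance.

```python
from statistics import mean


def classify_modulator(with_compound, without_compound, min_fold_change=1.5):
    """Label a candidate compound from replicate expression measurements.

    `min_fold_change` is a hypothetical cutoff used here in place of a
    proper statistical significance test.
    """
    treated = mean(with_compound)
    control = mean(without_compound)
    if control <= 0 or treated < 0:
        raise ValueError("control must be positive and treated non-negative")
    ratio = treated / control
    if ratio >= min_fold_change:
        return "stimulator"      # expression greater in the presence of the compound
    if ratio <= 1 / min_fold_change:
        return "inhibitor"       # expression less in the presence of the compound
    return "no clear effect"
```

For example, `classify_modulator([3.0, 3.2], [1.0, 1.1])` labels the compound a stimulator under the assumed cutoff.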

[0277] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a 26443 or 46873 protein can be confirmed in vivo, e.g., in an animal such as an animal model for aberrant fatty acid oxidation.

[0278] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a 26443 or 46873 modulating agent, an antisense 26443 or 46873 nucleic acid molecule, a 26443- or 46873-specific antibody, or a 26443- or 46873-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[0279] 26443 and 46873 Detection Assays

[0280] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome, e.g., to locate gene regions associated with genetic disease or to associate 26443 or 46873 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[0281] 26443 and 46873 Chromosome Mapping

[0282] The 26443 or 46873 nucleotide sequences or portions thereof can be used to map the location of the 26443 or 46873 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 26443 or 46873 sequences with genes associated with disease.

[0283] Briefly, 26443 or 46873 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 26443 or 46873 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 26443 or 46873 sequences will yield an amplified fragment.

[0284] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes, and a full set of mouse chromosomes, can allow easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220:919-924).

[0285] Other mapping strategies, e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries can be used to map 26443 or 46873 to a chromosomal location.

[0286] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques (Pergamon Press, New York 1988).

[0287] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to non-coding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[0288] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325:783-787.

[0289] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 26443 or 46873 gene, can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

[0290] 26443 and 46873 Tissue Typing

[0291] 26443 or 46873 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).

[0292] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 26443 or 46873 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.

[0293] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the non-coding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the non-coding regions, fewer sequences are necessary to differentiate individuals. The non-coding sequences of SEQ ID NO: 1 or SEQ ID NO:4 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a non-coding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO:3 or SEQ ID NO:6 are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[0294] If a panel of reagents from 26443 or 46873 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[0295] Use of Partial 26443 or 46873 Sequences in Forensic Biology

[0296] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[0297] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e. another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to non-coding regions of SEQ ID NO:1 or SEQ ID NO:4 (e.g., fragments derived from the non-coding regions of SEQ ID NO:1 or SEQ ID NO:4 having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[0298] The 26443 or 46873 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue, e.g., a tissue containing organelles having asparaginase. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 26443 or 46873 probes can be used to identify tissue by species and/or by organ type.

[0299] In a similar fashion, these reagents, e.g., 26443 or 46873 primers or probes can be used to screen tissue culture for contamination (i.e., screen for the presence of a mixture of different types of cells in a culture).

Predictive Medicine of 26443 and 46873

The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[0300] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene that encodes asparaginase.

[0301] Such disorders include, e.g., a disorder associated with the misexpression of an asparaginase; or a metabolic disorder, e.g., a disorder involving inappropriate cellular asparagine levels.

[0302] The method includes one or more of the following:

[0303] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 26443 or 46873 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[0304] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 26443 or 46873 gene;

[0305] detecting, in a tissue of the subject, the misexpression of the 26443 or 46873 gene, at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[0306] detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of a 26443 or 46873 polypeptide.

[0307] In preferred embodiments the method includes: ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 26443 or 46873 gene; an insertion of one or more nucleotides into the gene, a point mutation, e.g., a substitution of one or more nucleotides of the gene, a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[0308] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from SEQ ID NO:1 or 3 or naturally occurring mutants thereof, or 5′ or 3′ flanking sequences naturally associated with the 26443 or 46873 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[0309] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 26443 or 46873 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 26443 or 46873.

[0310] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[0311] In preferred embodiments the method includes determining the structure of a 26443 or 46873 gene, an abnormal structure being indicative of risk for the disorder.

[0312] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 26443 or 46873 protein or a nucleic acid, which hybridizes specifically with the gene. These and other embodiments are discussed below.

[0313] Diagnostic and Prognostic Assays of 26443 and 46873

[0314] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 26443 or 46873 molecules and for identifying variations and mutations in the sequence of 26443 or 46873 molecules.

[0315] Expression Monitoring and Profiling:

[0316] The presence, level, or absence of 26443 or 46873 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 26443 or 46873 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 26443 or 46873 protein such that the presence of 26443 or 46873 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the 26443 or 46873 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 26443 or 46873 gene; measuring the amount of protein encoded by the 26443 or 46873 gene; or measuring the activity of the protein encoded by the 26443 or 46873 gene.

[0317] The level of mRNA corresponding to the 26443 or 46873 gene in a cell can be determined by both in situ and in vitro formats.

[0318] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 26443 or 46873 nucleic acid, such as the nucleic acid of SEQ ID NO: 1, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 26443 or 46873 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[0319] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 26443 or 46873 gene.

[0320] The level of mRNA in a sample that is encoded by one of 26443 or 46873 can be evaluated with nucleic acid amplification, e.g., by RT-PCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
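The length ranges given above for amplification primers can be checked with a short script. The following is a minimal sketch, assuming exact-match annealing to a plus-strand template written 5′ to 3′; the function names are hypothetical, and real primer design would also consider melting temperature, mismatches, and secondary structure.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def flanked_region_length(template, forward_primer, reverse_primer):
    """Return the number of template bases between the two primer sites,
    or None if either primer has no exact match in the expected position."""
    template = template.upper()
    forward = forward_primer.upper()
    start = template.find(forward)
    if start < 0:
        return None
    inner_start = start + len(forward)
    # The reverse primer matches the minus strand, so its reverse
    # complement is what appears in the plus-strand template.
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, inner_start)
    if end < 0:
        return None
    return end - inner_start


def primer_pair_in_range(template, forward_primer, reverse_primer):
    """Check the stated ranges: primers of 10 to 30 nucleotides that
    flank a region of 50 to 200 nucleotides."""
    for primer in (forward_primer, reverse_primer):
        if not 10 <= len(primer) <= 30:
            return False
    flanked = flanked_region_length(template, forward_primer, reverse_primer)
    return flanked is not None and 50 <= flanked <= 200
```

The ranges in the text are approximate ("about"), so the hard cutoffs used here are a simplification.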

[0321] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 26443 or 46873 gene being analyzed.

[0322] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 26443 or 46873 mRNA, or genomic DNA, and comparing the presence of 26443 or 46873 mRNA or genomic DNA in the control sample with the presence of 26443 or 46873 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 26443 or 46873 transcript levels.

[0323] A variety of methods can be used to determine the level of protein encoded by 26443 or 46873. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody, with a sample to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)₂) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[0324] The detection methods can be used to detect 26443 or 46873 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 26443 or 46873 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 26443 or 46873 proteins include introducing into a subject a labeled anti-26443 or -46873 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-26443 or -46873 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[0325] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 26443 or 46873 proteins, and comparing the presence of 26443 or 46873 proteins in the control sample with the presence of 26443 or 46873 proteins in the test sample.

[0326] The invention also includes kits for detecting the presence of 26443 or 46873 in a biological sample. For example, the kit can include a compound or agent capable of detecting 26443 or 46873 protein or mRNA in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 26443 or 46873 protein or nucleic acid.

[0327] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[0328] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein-stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples that can be assayed and compared to the test sample contained. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[0329] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with mis-expressed or aberrant or unwanted 26443 or 46873 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as pain or deregulated cell proliferation.

[0330] In one embodiment, a disease or disorder associated with aberrant or unwanted 26443 or 46873 expression or activity is identified. A test sample is obtained from a subject and 26443 or 46873 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 26443 or 46873 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 26443 or 46873 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[0331] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 26443 or 46873 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for a disorder involving aberrant or unwanted 26443 or 46873 expression or activity.

[0332] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 26443 or 46873 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 26443 or 46873 (e.g., other genes associated with a 26443- or 46873-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
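One possible layout for such data records is sketched below. This is an illustration only: it uses Python's built-in SQLite module as a stand-in for the Oracle or Sybase environments mentioned above, and the table name, column names, gene label `other_gene`, and all values are hypothetical placeholders rather than measured data.

```python
import sqlite3

# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE expression_record (
        sample_id        TEXT NOT NULL,  -- identifier of the sample
        subject_id       TEXT,           -- subject from which the sample was derived
        diagnosis        TEXT,
        treatment        TEXT,
        gene             TEXT NOT NULL,  -- e.g., 26443, 46873, or another gene
        expression_level REAL NOT NULL,  -- value representing the level of expression
        PRIMARY KEY (sample_id, gene)
    )
    """
)

# Placeholder records; the numbers are invented for the example.
rows = [
    ("S1", "P1", "unknown", None, "26443", 2.5),
    ("S1", "P1", "unknown", None, "other_gene", 1.1),
    ("S2", "P2", "unknown", None, "26443", 0.8),
]
conn.executemany("INSERT INTO expression_record VALUES (?, ?, ?, ?, ?, ?)", rows)

# Retrieve the 26443 expression value recorded for each sample.
levels = conn.execute(
    "SELECT sample_id, expression_level FROM expression_record "
    "WHERE gene = ? ORDER BY sample_id",
    ("26443",),
).fetchall()
```

Storing one row per sample and gene lets a record carry values for genes other than 26443 or 46873 without changing the table definition.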

[0333] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 26443 or 46873 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used to diagnose a disorder involving aberrant or unwanted 26443 or 46873 expression or activity in a subject. The method can be used to monitor a treatment for a disorder involving aberrant or unwanted 26443 or 46873 expression or activity in a subject. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[0334] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 26443 or 46873 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from an uncontacted cell.
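The favorable-evaluation criterion above can be expressed as a single comparison. The sketch below assumes each profile is a list of expression values in the same gene order and uses Euclidean distance as the similarity measure; the text does not prescribe a particular measure, so that choice and the function name are assumptions.

```python
import math


def compound_evaluated_favorably(contacted, uncontacted, target):
    """Return True if the profile of the contacted cell is closer to the
    target profile than the profile of the uncontacted cell is.

    Euclidean distance is an assumed similarity measure.
    """
    return math.dist(contacted, target) < math.dist(uncontacted, target)
```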

[0335] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 26443 or 46873 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profiles is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
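The distance-vector metric mentioned above can be sketched as follows. The sketch assumes that "length" means the Euclidean norm of the element-wise difference and that both profiles list their values in the same order; the function names and the dictionary of named reference profiles are hypothetical.

```python
import math


def distance_vector_length(profile_a, profile_b):
    """Length (Euclidean norm) of the difference between two profiles,
    each given as a sequence with one value per dimension."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same number of dimensions")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


def most_similar_reference(subject_profile, reference_profiles):
    """Return the name of the reference profile with the shortest distance
    vector to the subject profile.

    `reference_profiles` maps a name to a profile vector.
    """
    return min(
        reference_profiles,
        key=lambda name: distance_vector_length(
            subject_profile, reference_profiles[name]
        ),
    )
```

A shorter distance vector indicates greater similarity, so step d) reduces to taking the minimum over the reference profiles.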

[0336] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[0337] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile, and the reference expression profiles each include a value representing the level of 26443 or 46873 expression.
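The steps recited above can be illustrated with a short routine. This is a sketch under stated assumptions: the reference database is represented by an in-memory dictionary, and the comparison score is the Pearson correlation coefficient, which is one of many possible similarity scores and is not specified in the text. All names are hypothetical.

```python
import math


def pearson_score(x, y):
    """Pearson correlation between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (sd_x * sd_y)


def compare_to_database(subject_profile, reference_db):
    """Score a received subject profile against every reference profile.

    `reference_db` stands in for a database and maps a reference name to
    its profile. Returns the best-matching name and all comparison scores.
    """
    scores = {
        name: pearson_score(subject_profile, reference)
        for name, reference in reference_db.items()
    }
    best_match = max(scores, key=scores.get)
    return best_match, scores
```

Returning both the best match and the full set of scores covers alternatives i) and ii) in one call.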

[0338] 26443 and 46873 Arrays and Uses Thereof

[0339] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to a 26443 or 46873 molecule (e.g., a 26443 or 46873 nucleic acid or a 26443 or 46873 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm², and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to the addresses of the plurality can be disposed on the array.

[0340] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to a 26443 or 46873 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 26443 or 46873. Each address of the subset can include a capture probe that hybridizes to a different region of a 26443 or 46873 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for a 26443 or 46873 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 26443 or 46873 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 26443 or 46873 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).

[0341] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[0342] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to a 26443 or 46873 polypeptide or fragment thereof. The polypeptide can be a naturally occurring interaction partner of 26443 or 46873 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-26443 or -46873 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[0343] In another aspect, the invention features a method of analyzing the expression of 26443 or 46873. The method includes providing an array as described above; contacting the array with a sample; and detecting binding of a 26443 or 46873 molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally, the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[0344] In another embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array, particularly the expression of 26443 or 46873. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 26443 or 46873. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
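As a rough illustration of how quantitative array data might be used to find genes co-regulated with 26443 or 46873, the following Python sketch groups genes by Pearson correlation of their expression levels across samples. This is a simpler stand-in for the hierarchical, k-means, or Bayesian clustering named above, not an implementation of any of them. The gene names `gene_x`, `gene_y`, and `gene_z`, the expression values, and the 0.9 threshold are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists of expression values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def co_regulated(expression, target, threshold=0.9):
    """Return genes whose expression across samples correlates with the target gene."""
    ref = expression[target]
    return sorted(
        gene
        for gene, levels in expression.items()
        if gene != target and pearson(ref, levels) >= threshold
    )


# Hypothetical expression levels measured in four diverse samples.
expression = {
    "26443": [1.0, 2.0, 3.0, 4.0],
    "gene_x": [2.1, 3.9, 6.2, 8.0],  # rises with 26443
    "gene_y": [4.0, 3.0, 2.0, 1.0],  # falls as 26443 rises
    "gene_z": [1.0, 3.0, 1.0, 3.0],  # weakly related
}

partners = co_regulated(expression, "26443")
print(partners)
```

With many genes and samples, a dedicated clustering routine would normally replace the pairwise threshold used here.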

[0345] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 26443 or 46873 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[0346] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of other than the target gene can be ascertained and counteracted.
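The comparison between cells contacted with an agent and like cells not contacted with the agent can be sketched as a per-gene log2 fold change. The sketch below is illustrative only: the two-fold cutoff, the gene names `gene_a` and `gene_b`, the expression values, and the function name `changed_genes` are assumptions rather than anything specified herein.

```python
import math


def changed_genes(treated, untreated, target, min_abs_log2_fc=1.0):
    """Return {gene: log2 fold change} for non-target genes that change by the cutoff or more."""
    hits = {}
    for gene in treated.keys() & untreated.keys():
        if gene == target:
            continue  # the intended target is expected to change; report only other genes
        log2_fc = math.log2(treated[gene] / untreated[gene])
        if abs(log2_fc) >= min_abs_log2_fc:
            hits[gene] = log2_fc
    return hits


# Hypothetical array signals for like cells with and without the agent.
untreated = {"26443": 100.0, "gene_a": 50.0, "gene_b": 80.0}
treated = {"26443": 20.0, "gene_a": 200.0, "gene_b": 85.0}

off_target = changed_genes(treated, untreated, target="26443")
print(off_target)
```

Genes flagged in this way would be candidates for explaining an undesirable effect of the agent, subject to replication and appropriate statistics.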

[0347] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of a 26443- or 46873-associated disease or disorder; and processes, such as a cellular transformation associated with a 26443- or 46873-associated disease or disorder. The method can also evaluate the treatment and/or progression of a 26443- or 46873-associated disease or disorder.

[0348] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 26443 or 46873) that could serve as a molecular target for diagnosis or therapeutic intervention.

[0349] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon a 26443 or 46873 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to a 26443 or 46873 polypeptide or fragment thereof. For example, multiple variants of a 26443 or 46873 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the addresses of the plurality can be disposed on the array.

[0350] The polypeptide array can be used to detect a 26443 or 46873 binding compound, e.g., an antibody in a sample from a subject with specificity for a 26443 or 46873 polypeptide or the presence of a 26443- or 46873-binding protein or ligand.

[0351] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 26443 or 46873 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[0352] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 26443 or 46873 or from a cell or subject in which a 26443- or 46873-mediated response has been elicited, e.g., by contact of the cell with 26443 or 46873 nucleic acid or protein, or administration to the cell or subject of 26443 or 46873 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 26443 or 46873 (or does not express as highly as in the case of the 26443 or 46873 positive plurality of capture probes) or from a cell or subject in which a 26443- or 46873-mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which are preferably other than a 26443 or 46873 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[0353] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe; contacting the array with a first sample from a cell or subject which expresses or mis-expresses 26443 or 46873 or from a cell or subject in which a 26443- or 46873-mediated response has been elicited, e.g., by contact of the cell with 26443 or 46873 nucleic acid or protein, or administration to the cell or subject of 26443 or 46873 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a second sample from a cell or subject which does not express 26443 or 46873 (or does not express as highly as in the case of the 26443 or 46873 positive plurality of capture probes) or from a cell or subject in which a 26443- or 46873-mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used, the plurality of addresses with capture probes should be present on both arrays.

[0354] In another aspect, the invention features a method of analyzing 26443 or 46873, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 26443 or 46873 nucleic acid or amino acid sequence; and comparing the 26443 or 46873 sequence with one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 26443 or 46873.
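A toy sketch of comparing a query sequence with a collection of sequences is given below. It ranks entries with `difflib.SequenceMatcher` from the Python standard library, which is only a crude stand-in for the alignment tools ordinarily used to search sequence databases. The query and collection sequences are short invented strings, not sequences of the invention, and the names `rank_by_similarity`, `seq_1`, `seq_2`, and `seq_3` are hypothetical.

```python
from difflib import SequenceMatcher


def rank_by_similarity(query, collection):
    """Return (similarity ratio, name) pairs for every entry, most similar first."""
    scored = [
        (SequenceMatcher(None, query, seq, autojunk=False).ratio(), name)
        for name, seq in collection.items()
    ]
    return sorted(scored, reverse=True)


# Invented sequences used purely to exercise the comparison.
query = "ATGGCCAAGT"
collection = {
    "seq_1": "ATGGCCAAGT",  # identical to the query
    "seq_2": "ATGGCTAAGT",  # one substitution
    "seq_3": "TTTTTTTTTT",  # unrelated
}

ranked = rank_by_similarity(query, collection)
for ratio, name in ranked:
    print(name, round(ratio, 2))
```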

[0355] Detection of 26443 and 46873 Variations or Mutations

[0356] The methods of the invention can also be used to detect genetic alterations in a 26443 or 46873 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by mis-regulation in 26443 or 46873 protein activity or nucleic acid expression, such as a disorder involving aberrant or unwanted 26443 or 46873 expression or activity. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a 26443 or 46873 protein, or the mis-expression of the 26443 or 46873 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of 1) a deletion of one or more nucleotides from a 26443 or 46873 gene; 2) an addition of one or more nucleotides to a 26443 or 46873 gene; 3) a substitution of one or more nucleotides of a 26443 or 46873 gene; 4) a chromosomal rearrangement of a 26443 or 46873 gene; 5) an alteration in the level of a messenger RNA transcript of a 26443 or 46873 gene; 6) aberrant modification of a 26443 or 46873 gene, such as of the methylation pattern of the genomic DNA; 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of a 26443 or 46873 gene; 8) a non-wild type level of a 26443 or 46873 protein; 9) allelic loss of a 26443 or 46873 gene; and

[0357] 10) inappropriate post-translational modification of a 26443 or 46873 protein.

[0358] An alteration can be detected with a probe/primer in a polymerase chain reaction, such as anchor PCR or RACE-PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 26443 or 46873 gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a 26443 or 46873 gene under conditions such that hybridization and amplification of the 26443 or 46873 gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[0359] In another embodiment, mutations in a 26443 or 46873 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA are isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment lengths are determined, e.g., by gel electrophoresis, and compared. Differences in fragment lengths between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence-specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
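The restriction-pattern comparison described above can be illustrated with a small in silico digest. The sketch assumes a linear DNA molecule and uses the EcoRI recognition site GAATTC, cut after the first G, as the example enzyme. The control and sample sequences are invented for illustration, and the function name `digest` is hypothetical.

```python
def digest(seq, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at every occurrence of site.

    cut_offset is the number of bases into the site at which the strand is cut.
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Invented sequences: the sample carries a C-to-A change that destroys the second site.
control = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TTTT"
sample = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTA" + "TTTT"

control_fragments = digest(control, "GAATTC", cut_offset=1)
sample_fragments = digest(sample, "GAATTC", cut_offset=1)

print(control_fragments, sample_fragments)
print("pattern differs:", sorted(control_fragments) != sorted(sample_fragments))
```

In this toy case the loss of one site merges two control fragments into a single longer sample fragment, which is the kind of difference a gel would reveal.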

[0360] In other embodiments, genetic mutations in 26443 or 46873 can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the other. A different probe is located at each address of the plurality. A probe can be complementary to a region of a 26443 or a 46873 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of a 26443 or 46873 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 26443 or 46873 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
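A linear array of sequential overlapping probes, as used in the first hybridization step described above, can be sketched as a sliding window over a target sequence, with each probe being the reverse complement of its window. The target sequence, probe length, and step size below are invented for illustration, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def tiling_probes(target, probe_len, step=1):
    """Return overlapping probes, each complementary to one window of the target."""
    return [
        reverse_complement(target[i:i + probe_len])
        for i in range(0, len(target) - probe_len + 1, step)
    ]


# Invented 8-base target tiled with 4-base probes offset by one base.
probes = tiling_probes("ATGCGTAC", probe_len=4, step=1)
print(probes)
```

Real probe arrays use much longer probes; the short lengths here only keep the output readable.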

[0361] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 26443 or 46873 gene and detect mutations by comparing the sequence of the sample 26443 or 46873 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.

[0362] Other methods for detecting mutations in the 26443 or 46873 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl. Acad Sci USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295).

[0363] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 26443 or 46873 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[0364] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 26443 or 46873 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc Natl. Acad. Sci USA 86:2766; see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control 26443 or 46873 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to the sequence; the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).

[0365] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[0366] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad. Sci USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[0367] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.

[0368] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 50%, 60%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to a 26443 or 46873 nucleic acid.

[0369] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO:1, 3, 4 or 6 or the complement of SEQ ID NO:1, 3, 4 or 6. Different locations can be different but overlapping, or non-overlapping, on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[0370] The set can be useful, e.g., for identifying SNPs, or identifying specific alleles of 26443 or 46873. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a bi-allelic or polymorphic locus.

[0371] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the T_m of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
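A set of four oligonucleotides differing only at an interrogation position can be sketched as follows. The sketch idealizes hybridization as an exact complementary match, which real hybridization conditions only approximate, and it uses an invented 8-base region rather than any sequence of the invention. The function names `interrogation_set` and `call_base` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def interrogation_set(target_region, position):
    """Return four probes, each complementary to one possible base at the given position."""
    probes = {}
    for base in "ACGT":
        variant = target_region[:position] + base + target_region[position + 1:]
        probes[base] = reverse_complement(variant)
    return probes


def call_base(sample_region, probes):
    """Return the base whose probe is perfectly complementary to the sample, else None."""
    for base, probe in probes.items():
        if reverse_complement(probe) == sample_region:
            return base
    return None


# Invented region with the interrogation position at index 3 (0-based).
reference_region = "ACGTTGCA"
probes = interrogation_set(reference_region, position=3)

print(call_base("ACGTTGCA", probes))  # sample carrying the reference base
print(call_base("ACGATGCA", probes))  # sample carrying a variant base
```

Differentially labeling the four probes, as described above, would let the matching probe be identified by its signal.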

[0372] In a preferred embodiment the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, a 26443 or 46873 nucleic acid.

[0373] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a 26443 or 46873 gene.

[0374] Use of 26443 or 46873 Molecules as Surrogate Markers

[0375] The 26443 or 46873 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 26443 or 46873 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 26443 or 46873 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker that correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[0376] The 26443 or 46873 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker that correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., a 26443 or 46873 marker) transcription or expression, the amplified marker may be in a quantity that is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-26443 or -46873 antibodies may be employed in an immune-based detection system for a 26443 or 46873 protein marker, or 26443- or 46873-specific radiolabeled probes may be used to detect a 26443 or 46873 mRNA marker. 
Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[0377] The 26443 or 46873 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker that correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35:1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA or protein (e.g., 26443 or 46873 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 26443 or 46873 DNA may correlate with drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[0378] Pharmaceutical Compositions of 26443 and 46873

[0379] The nucleic acids and polypeptides, fragments thereof, as well as anti-26443 or -46873 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein, the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[0380] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral (e.g., intravenous, intradermal, subcutaneous), oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates; and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[0381] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol and sorbitol, or sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including an agent in the composition that delays absorption, for example, aluminum monostearate and gelatin.

[0382] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filter sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle that contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[0383] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[0384] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser that contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[0385] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[0386] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[0387] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[0388] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[0389] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD₅₀ (the dose lethal to 50% of the population) and the ED₅₀ (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD₅₀/ED₅₀. Compounds that exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
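
The ratio described above can be illustrated with a minimal Python sketch. The function name and the numeric values are hypothetical and chosen only for illustration; they are not data from the specification.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the ratio LD50/ED50. Both doses must use the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg, for illustration only.
example_index = therapeutic_index(ld50=200.0, ed50=10.0)
print(example_index)
```

A larger value indicates a wider margin between the therapeutically effective dose and the lethal dose, which is why compounds with high therapeutic indices are preferred.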

[0390] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED₅₀ with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC₅₀ (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.
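
As one possible illustration of how an IC₅₀ might be read from cell culture data, the following Python sketch interpolates on a logarithmic concentration scale between the two measurements that bracket 50% inhibition. The interpolation method, the function name, and the input layout are assumptions made for this sketch; the specification does not prescribe a particular estimation procedure.

```python
import math


def estimate_ic50(concentrations, inhibitions):
    """Estimate the concentration giving half-maximal (50%) inhibition.

    concentrations: positive values in ascending order, in any consistent unit.
    inhibitions: fraction of maximal inhibition (0.0 to 1.0) at each concentration.
    Returns None when no adjacent pair of measurements brackets 50% inhibition.
    """
    if len(concentrations) != len(inhibitions):
        raise ValueError("inputs must have the same length")
    points = list(zip(concentrations, inhibitions))
    for (c0, i0), (c1, i1) in zip(points, points[1:]):
        if min(i0, i1) <= 0.5 <= max(i0, i1):
            if i0 == i1:
                # Both measurements sit exactly at 50%; report the lower one.
                return c0
            fraction = (0.5 - i0) / (i1 - i0)
            log_c = math.log10(c0) + fraction * (math.log10(c1) - math.log10(c0))
            return 10 ** log_c
    return None
```

In practice a fitted dose-response curve would usually be preferred over two-point interpolation; the sketch only shows what "the concentration which achieves half-maximal inhibition" means numerically.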

[0391] As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
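
The weight-based ranges above can be converted to total amounts for a given subject with simple arithmetic. The following Python sketch is illustrative only: the names are hypothetical, only the four broadest ranges are encoded, and the stated bounds are treated as exact even though the text qualifies them with "about".

```python
# Ranges from the paragraph above, in mg per kg body weight, widest first.
PROTEIN_DOSE_RANGES_MG_PER_KG = [
    (0.001, 30.0),
    (0.01, 25.0),
    (0.1, 20.0),
    (1.0, 10.0),
]


def total_dose_range_mg(body_weight_kg, range_mg_per_kg):
    """Scale a (low, high) mg/kg range to total milligrams for one subject."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_mg_per_kg
    return (low * body_weight_kg, high * body_weight_kg)


def narrowest_range_containing(dose_mg_per_kg):
    """Return the narrowest listed range containing the dose, or None."""
    match = None
    for low, high in PROTEIN_DOSE_RANGES_MG_PER_KG:
        if low <= dose_mg_per_kg <= high:
            match = (low, high)  # later entries are narrower, so keep the last hit
    return match
```

For example, for a hypothetical 70 kg subject, `total_dose_range_mg(70, (1.0, 10.0))` gives 70 mg to 700 mg.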

[0392] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J. Acquired Immune Deficiency Syndromes and Human Retrovirology 14:193).

[0393] The present invention encompasses agents that modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[0394] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.
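
The stepwise approach described above (a relatively low initial dose, increased until an appropriate response is obtained) can be sketched as a simple loop. Everything in this Python sketch is a hypothetical illustration: the function name, the multiplicative step, the upper limit, and the `response_ok` callback are assumptions and do not represent a procedure defined in the specification.

```python
def escalate_dose(start_dose, step_factor, max_dose, response_ok):
    """Return the first dose at which response_ok(dose) is True, or None.

    start_dose: initial, relatively low dose (any consistent unit, e.g. µg/kg).
    step_factor: multiplier applied after each inadequate response; must exceed 1.
    max_dose: upper limit that the escalation never exceeds.
    response_ok: caller-supplied function reporting whether the response is adequate.
    """
    if start_dose <= 0 or step_factor <= 1:
        raise ValueError("start_dose must be positive and step_factor greater than 1")
    dose = start_dose
    while dose <= max_dose:
        if response_ok(dose):
            return dose
        dose *= step_factor
    return None
```

For instance, starting at 1 µg/kg and doubling, with a hypothetical response threshold of 5 µg/kg, the loop tries 1, 2, 4, and 8 and stops at 8.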

[0395] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive metal ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, and puromycin and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine and vinblastine).

[0396] The conjugates of the invention can be used for modifying a given biological response, although the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, Pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, alpha-interferon, beta-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[0397] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate, as described by Segal in U.S. Pat. No. 4,676,980.

[0398] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[0399] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[0400] Methods of Treatment for 26443 and 46873

[0401] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 26443 or 46873 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[0402] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. “Pharmacogenomics”, as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype” or “drug response genotype”). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 26443 or 46873 molecules of the present invention or 26443 or 46873 modulators according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

[0403] In one aspect, the invention provides a method for preventing in a subject, a disease or condition associated with an aberrant or unwanted 26443 or 46873 expression or activity, by administering to the subject a 26443 or 46873 or an agent which modulates 26443 or 46873 expression or at least one 26443 or 46873 activity. Subjects at risk for a disease that is caused or contributed to by aberrant or unwanted 26443 or 46873 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 26443 or 46873 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 26443 or 46873 aberrance, for example, a 26443 or 46873, 26443 or 46873 agonist or 26443 or 46873 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[0404] It is possible that some 26443 or 46873 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[0405] As discussed, successful treatment of 26443 or 46873 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using an assay described above, that prove to exhibit negative modulatory activity, can be used in accordance with the invention to prevent and/or ameliorate symptoms of 26443 or 46873 disorders. Such molecules can include, but are not limited to, peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)₂ and Fab expression library fragments, scFv molecules, and epitope-binding fragments thereof).

[0406] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[0407] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[0408] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 26443 or 46873 expression is through the use of aptamer molecules specific for 26443 or 46873 protein. Aptamers are nucleic acid molecules having a tertiary structure that permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. Curr. Opin. Chem Biol. 1997, 1(1): 5-9; and Patel, D. J. Curr Opin Chem Biol 1997 June; 1(1):32-46). Since nucleic acid molecules may in many cases be more conveniently introduced into target cells than therapeutic protein molecules may be, aptamers offer a method by which 26443 or 46873 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[0409] Antibodies can be generated that are both specific for target gene product and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 26443 or 46873 disorders. For a description of antibodies, see the Antibody section above.

[0410] In circumstances wherein injection of an animal or a human subject with a 26443 or 46873 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 26443 or 46873 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. Ann Med 1999;31(1):66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. Cancer Treat Res 1998;94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 26443 or 46873 protein. Vaccines directed to a disease characterized by 26443 or 46873 expression may also be generated in this fashion.

[0411] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see e.g., Marasco et al., 1993, Proc. Natl. Acad. Sci. USA 90:7889-7893).

[0412] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 26443 or 46873 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders.

[0413] Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. The compound which is able to modulate 26443 or 46873 activity is used as a template, or “imprinting molecule”, to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix that contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al (1996) Current Opinion in Biotechnology 7:89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2:166-173. Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al (1993) Nature 361:645-647. Through the use of isotope-labeling, the “free” concentration of compound which modulates the expression or activity of 26443 or 46873 can be readily monitored and used in calculations of IC₅₀.

[0414] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC₅₀. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al (1995) Analytical Chemistry 67:2142-2144.

[0415] Another aspect of the invention pertains to methods of modulating 26443 or 46873 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with a 26443 or 46873 molecule or an agent that modulates one or more of the activities of 26443 or 46873 protein associated with the cell. An agent that modulates 26443 or 46873 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of a 26443 or 46873 protein (e.g., a 26443 or 46873 substrate or receptor), a 26443 or 46873 antibody, a 26443 or 46873 agonist or antagonist, a peptidomimetic of a 26443 or 46873 agonist or antagonist, or other small molecule.

[0416] In one embodiment, the agent stimulates one or more 26443 or 46873 activities. Examples of such stimulatory agents include active 26443 or 46873 protein and a nucleic acid molecule encoding 26443 or 46873. In another embodiment, the agent inhibits one or more 26443 or 46873 activities. Examples of such inhibitory agents include antisense 26443 or 46873 nucleic acid molecules, anti-26443 or 46873 antibodies, and 26443 or 46873 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of a 26443 or 46873 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., upregulates or downregulates) 26443 or 46873 expression or activity. In another embodiment, the method involves administering a 26443 or 46873 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 26443 or 46873 expression or activity.

[0417] Stimulation of 26443 or 46873 activity is desirable in situations in which 26443 or 46873 is abnormally downregulated and/or in which increased 26443 or 46873 activity is likely to have a beneficial effect. Likewise, inhibition of 26443 or 46873 activity is desirable in situations in which 26443 or 46873 is abnormally upregulated and/or in which decreased 26443 or 46873 activity is likely to have a beneficial effect.

[0418] Examples of other disorders which can be treated include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy, such as atopic allergy.

[0419] In addition, aberrant activity of a 26443 or 46873 polypeptide may adversely affect a muscle cell. Examples of disorders involving, for example, heart muscle, or “cardiovascular disorders”, include, but are not limited to, a disease, disorder, or state involving the cardiovascular system, e.g., the heart and/or coronary blood vessels. A cardiovascular disorder can be caused by a malfunction of the heart, an imbalance in arterial pressure or an occlusion of a blood vessel, e.g., by a thrombus. Examples of such disorders include arrhythmias, myocardial infarction, hypertension, atherosclerosis, coronary artery spasm, congestive heart failure, coronary artery disease, valvular disease and cardiomyopathies. Additionally, skeletal muscle cells may be affected by aberrant activity of a 26443 or 46873 polypeptide. For instance, symptoms of a skeletal muscular disorder may include aching muscles, muscle cramps or muscle degeneracy.

[0420] Examples of liver disorders include, but are not limited to, disorders associated with an accumulation of fibrous tissue, such as that resulting from an imbalance between production and degradation of the extracellular matrix accompanied by the collapse and condensation of preexisting fibers; hepatocellular necrosis or injury induced by a wide variety of agents including processes which disturb homeostasis, such as an inflammatory process, tissue damage resulting from toxic injury or altered hepatic blood flow, and infections (e.g., bacterial, viral and parasitic); hepatic injury, such as portal hypertension or hepatic fibrosis; liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, e.g., A1-antitrypsin deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease), disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome); liver injury associated with the administration of various chemicals or drugs, such as, for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol, or which represents a hepatic manifestation of a vascular disorder, such as obstruction of either the intrahepatic or extrahepatic bile flow or an alteration in hepatic circulation resulting, for example, from chronic heart failure, veno-occlusive disease, portal vein thrombosis or Budd-Chiari syndrome.

[0421] Additionally, 26443 or 46873 may play an important role in overall metabolism. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes.

[0422] Moreover, a 26443 or 46873 protein may regulate cellular amino acid levels (e.g., asparagine, aspartic acid). A defect or deficiency in a 26443 or 46873 polypeptide, therefore, may result in inappropriate levels of, e.g., asparagine and/or aspartic acid, thereby causing a variety of disorders, for example, neurological disorders. Examples of neural disorders include, but are not limited to, neurodegenerative disorders, e.g., Alzheimer's disease, dementias related to Alzheimer's disease (such as Pick's disease), Parkinson's disease and other diffuse Lewy body diseases, multiple sclerosis, amyotrophic lateral sclerosis, progressive supranuclear palsy, epilepsy, and Creutzfeldt-Jakob disease; psychiatric disorders, e.g., depression, schizophrenic disorders, Korsakoff's psychosis, mania, anxiety disorders, or phobic disorders; learning or memory disorders, e.g., amnesia or age-related memory loss; and neurological disorders, e.g., migraine. The ability to regulate or control the expression of a 26443 or 46873 protein may result in the ability to likewise regulate or control levels of amino acids, e.g., asparagine or aspartic acid, thereby providing a protective and/or therapeutic effect against, e.g., neurological disorders.

[0423] Thus, the 26443 or 46873 molecules can act as novel diagnostic targets and therapeutic agents for controlling defects resulting in metabolic deficiencies and/or improper amino acid levels, e.g., asparagine or aspartic acid.

[0424] Aberrant expression and/or activity of 26443 or 46873 molecules may mediate disorders associated with bone metabolism. “Bone metabolism” refers to direct or indirect effects in the formation or degeneration of bone structures, e.g., bone formation, bone resorption, etc., which may ultimately affect the concentrations in serum of calcium and phosphate. This term also includes activities mediated by 26443 or 46873 molecules in bone cells, e.g., osteoclasts and osteoblasts, that may in turn result in bone formation and degeneration. For example, 26443 or 46873 molecules may support different activities of bone resorbing osteoclasts, such as the stimulation of differentiation of monocytes and mononuclear phagocytes into osteoclasts. Accordingly, 26443 or 46873 molecules that modulate the production of bone cells can influence bone formation and degeneration, and thus may be used to treat bone disorders. Examples of such disorders include, but are not limited to, osteoporosis, osteodystrophy, osteomalacia, rickets, osteitis fibrosa cystica, renal osteodystrophy, osteosclerosis, anti-convulsant treatment, osteopenia, fibrogenesis imperfecta ossium, secondary hyperparathyroidism, hypoparathyroidism, hyperparathyroidism, cirrhosis, obstructive jaundice, drug induced metabolism, medullary carcinoma, chronic renal disease, sarcoidosis, glucocorticoid antagonism, malabsorption syndrome, steatorrhea, tropical sprue, idiopathic hypercalcemia and milk fever.

[0425] Additionally, 26443 or 46873 molecules may play an important role in the etiology of certain viral diseases, including but not limited to Hepatitis B, Hepatitis C and Herpes Simplex Virus (HSV). Modulators of 26443 or 46873 activity could be used to control viral diseases. The modulators can be used in the treatment and/or diagnosis of viral infected tissue or virus-associated tissue fibrosis, especially liver and liver fibrosis. Also, 26443 or 46873 modulators can be used in the treatment and/or diagnosis of virus-associated carcinoma, especially hepatocellular cancer and lymphomas.

[0426] Additionally, 26443 or 46873 may play an important role in the regulation of pain disorders. Examples of pain disorders include, but are not limited to, pain response elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia, usually referred to as hyperalgesia (described in, for example, Fields, H. L. (1987) Pain, New York: McGraw-Hill); pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches; pain associated with surgery; pain related to irritable bowel syndrome; or chest pain.

[0427] 26443 and 46873 Pharmacogenomics

[0428] The 26443 or 46873 molecules of the present invention, as well as agents or modulators which have a stimulatory or inhibitory effect on 26443 or 46873 activity (e.g., 26443 or 46873 gene expression) as identified by a screening assay described herein, can be administered to individuals to treat (prophylactically or therapeutically) 26443 or 46873 associated disorders (e.g., metabolic disorders or defects associated with fatty acid oxidation) associated with aberrant or unwanted 26443 or 46873 activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a 26443 or 46873 molecule or 26443 or 46873 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a 26443 or 46873 molecule or 26443 or 46873 modulator.

[0429] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23(10-11): 983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43(2):254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as single factors altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase (G6PD) deficiency is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[0430] One pharmacogenomics approach to identifying genes that predict drug response, known as “a genome-wide association”, relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high-resolution map can be generated from a combination of some ten million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
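
By way of non-limiting illustration, the grouping of individuals into genetic categories according to their pattern of SNPs can be sketched as follows. This is a minimal sketch only: the function name, the marker names, and the genotype calls are hypothetical placeholders and do not represent data from any study or any marker of the invention.

```python
from collections import defaultdict


def group_by_snp_pattern(genotypes, markers):
    """Group individuals whose allele calls agree at every listed marker.

    genotypes: dict mapping an individual's identifier to a dict of
               {marker name: allele call}.
    markers:   ordered list of marker names that defines the pattern.
    Returns a dict mapping each observed pattern (a tuple of allele calls)
    to the list of individuals sharing that pattern.
    """
    groups = defaultdict(list)
    for individual, calls in genotypes.items():
        pattern = tuple(calls[m] for m in markers)
        groups[pattern].append(individual)
    return dict(groups)


# Hypothetical allele calls at two hypothetical bi-allelic markers.
example_genotypes = {
    "subject_1": {"marker_a": "A", "marker_b": "G"},
    "subject_2": {"marker_a": "A", "marker_b": "G"},
    "subject_3": {"marker_a": "T", "marker_b": "G"},
}
example_groups = group_by_snp_pattern(example_genotypes, ["marker_a", "marker_b"])
print(example_groups)
```

Each resulting group could then be examined for a shared drug response; any association with response would have to be established separately, e.g., by the clinical comparison described above.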

[0431] Alternatively, a method termed the “candidate gene approach”, can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., a 26443 or 46873 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.

[0432] Alternatively, a method termed “gene expression profiling” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., a 26443 or 46873 molecule or 26443 or 46873 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[0433] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 26443 or 46873 molecule or 26443 or 46873 modulator, such as a modulator identified by one of the exemplary screening assays described herein.

[0434] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 26443 or 46873 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 26443 or 46873 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells will become sensitive to treatment with an agent that the unmodified target cells were resistant to.

[0435] Monitoring the influence of agents (e.g., drugs) on the expression or activity of a 26443 or 46873 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 26443 or 46873 gene expression, protein levels, or upregulate 26443 or 46873 activity, can be monitored in clinical trials of subjects exhibiting decreased 26443 or 46873 gene expression, protein levels, or downregulated 26443 or 46873 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 26443 or 46873 gene expression, protein levels, or downregulate 26443 or 46873 activity, can be monitored in clinical trials of subjects exhibiting increased 26443 or 46873 gene expression, protein levels, or upregulated 26443 or 46873 activity. In such clinical trials, the expression or activity of a 26443 or 46873 gene, and preferably, other genes that have been implicated in, for example, a 26443- or 46873-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[0436] 26443 or 46873 Informatics

[0437] The sequence of a 26443 or 46873 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a 26443 or 46873 sequence. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 26443 or 46873 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequence, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[0438] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[0439] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[0440] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples for annotations to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites and splice junctions. Non-limiting examples for annotations to amino acid sequence include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
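
By way of non-limiting illustration, the two-table arrangement described above can be sketched as follows. The sketch assumes SQLite, accessed through the sqlite3 module of the Python standard library, in place of the database products named above. The table names, column names, sequence identifier, sample sequence, and annotation text are hypothetical placeholders and are not sequences or annotations of the invention.

```python
import sqlite3


def build_sequence_db(conn):
    """Create a first table for sequences and a second table for annotations."""
    conn.execute(
        "CREATE TABLE sequences ("
        " seq_id TEXT PRIMARY KEY,"      # identifier for the sequence
        " sequence TEXT NOT NULL)"       # nucleic acid or amino acid sequence
    )
    conn.execute(
        "CREATE TABLE annotations ("
        " seq_id TEXT NOT NULL REFERENCES sequences(seq_id),"
        " descriptor TEXT NOT NULL,"     # annotation text
        " start_pos INTEGER NOT NULL,"   # initial position (1-based)
        " end_pos INTEGER NOT NULL)"     # ultimate position (1-based, inclusive)
    )


def annotated_regions(conn, seq_id):
    """Return (descriptor, subsequence) pairs for one stored sequence."""
    rows = conn.execute(
        "SELECT a.descriptor,"
        " substr(s.sequence, a.start_pos, a.end_pos - a.start_pos + 1)"
        " FROM annotations a JOIN sequences s ON s.seq_id = a.seq_id"
        " WHERE a.seq_id = ? ORDER BY a.start_pos",
        (seq_id,),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
build_sequence_db(conn)
# Placeholder sequence and annotations, for illustration only.
conn.execute("INSERT INTO sequences VALUES (?, ?)", ("demo_seq", "ATGGCCAAGTGA"))
conn.execute("INSERT INTO annotations VALUES (?, ?, ?, ?)",
             ("demo_seq", "start codon (placeholder)", 1, 3))
conn.execute("INSERT INTO annotations VALUES (?, ?, ?, ?)",
             ("demo_seq", "stop codon (placeholder)", 10, 12))
print(annotated_regions(conn, "demo_seq"))
```

Keeping the annotations in their own table allows any number of annotations to refer to one stored sequence through the shared identifier field.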

[0441] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention that match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.
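
By way of non-limiting illustration, a search of stored sequence information for regions matching a target sequence can be sketched as follows. The sketch performs exact string matching only and is a simplified stand-in for, not an implementation of, a BLAST search or other routine sequence comparison; the function name and the stored sequences are hypothetical.

```python
def find_matches(target, sequences):
    """Locate every exact occurrence of `target` in the stored sequences.

    sequences: dict mapping a sequence identifier to its sequence string.
    Returns a list of (identifier, 1-based start position) pairs,
    including overlapping occurrences.
    """
    hits = []
    for seq_id, seq in sequences.items():
        start = seq.find(target)
        while start != -1:
            hits.append((seq_id, start + 1))
            # Resume one position later so overlapping matches are reported.
            start = seq.find(target, start + 1)
    return hits


# Placeholder sequences, for illustration only.
stored = {"seq_a": "ATGATGA", "seq_b": "CCC"}
print(find_matches("ATGA", stored))
```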

[0442] Thus, in one aspect, the invention features a method of analyzing 26443 or 46873, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing a 26443 or 46873 nucleic acid or amino acid sequence; and comparing the 26443 or 46873 sequence with a second sequence, e.g., one or, more preferably, a plurality of sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 26443 or 46873. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[0443] The method can include evaluating the sequence identity between a 26443 or 46873 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.
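
By way of non-limiting illustration, one simple way of evaluating sequence identity between two sequences can be sketched as follows. The sketch assumes that the two sequences have already been aligned to equal length, with gaps written as "-", and counts identical residues over all aligned columns; this convention is an assumption made for illustration, and other definitions of percent identity described herein or used by particular software may differ.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns at which the two sequences are identical.

    Both inputs must be pre-aligned strings of equal length, with "-"
    marking a gap. Columns in which both sequences have a gap are skipped;
    a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


print(percent_identity("ACGT", "ACGA"))
```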

[0444] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
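
By way of non-limiting illustration, the relationship between target sequence length and the likelihood of a random occurrence in a database can be sketched as follows. The sketch assumes a simplified model in which every residue is equally likely and independent of its neighbors, which real sequences do not strictly satisfy; the function name and the database size used in the example are hypothetical.

```python
def expected_chance_hits(target_length, alphabet_size, database_positions):
    """Expected number of chance occurrences of one specific target.

    Under the uniform, independent-residue assumption, a given target of
    length n matches at any one position with probability
    alphabet_size ** -n, so the expected count over the database is
    database_positions / alphabet_size ** n.
    """
    return database_positions / (alphabet_size ** target_length)


# Nucleotide targets (alphabet of 4) against a hypothetical database
# offering one million candidate start positions.
print(expected_chance_hits(6, 4, 1_000_000))
print(expected_chance_hits(30, 4, 1_000_000))
```

Under these assumptions a six-nucleotide target is expected to occur by chance hundreds of times in such a database, whereas a thirty-nucleotide target is expected to occur by chance far less than once, consistent with the observation that longer target sequences are less likely to be present as a random occurrence.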

[0445] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[0446] Thus, the invention features a method of making a computer readable record of a 26443 or 46873 sequence that includes recording the sequence on a computer readable matrix. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.
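
By way of non-limiting illustration, the making of a computer readable record that includes identification of an ORF can be sketched as follows. The sketch scans only the forward strand for the first ATG that is followed in frame by a stop codon, which is a simplification of ORF identification as generally practiced; the function names, the record layout, and the sample sequence are hypothetical and are not sequences of the invention.

```python
import json

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(seq):
    """Return (start, end) of the first forward-strand ORF, or None.

    Positions are 1-based and inclusive; the ORF runs from the A of an
    ATG start codon through the last base of the first in-frame stop.
    """
    seq = seq.upper()
    start = seq.find("ATG")
    while start != -1:
        # Walk codon by codon from the codon after the start codon.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return (start + 1, i + 3)
        start = seq.find("ATG", start + 1)
    return None


def make_record(seq_id, seq):
    """Build a serializable record holding the sequence and its ORF."""
    orf = first_orf(seq)
    return {"id": seq_id, "sequence": seq, "orf": list(orf) if orf else None}


# Placeholder sequence, for illustration only.
record = make_record("demo_seq", "CCATGGCCTAAGG")
print(json.dumps(record))
```

The serialized record can then be written to any of the machine-readable media described herein.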

[0447] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing a 26443 or 46873 sequence, or record, in machine-readable form; and comparing a second sequence to the 26443 or 46873 sequence, thereby analyzing a sequence. Comparison can include comparing to sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 26443 or 46873 sequence includes a sequence being compared. In a preferred embodiment the 26443 or 46873 or second sequence is stored on a first computer, e.g., at a first site and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 26443 or 46873 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[0448] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has a 26443- or 46873-associated disease or disorder or a pre-disposition to a 26443- or 46873-associated disease or disorder, wherein the method comprises the steps of determining 26443 or 46873 sequence information associated with the subject and based on the 26443 or 46873 sequence information, determining whether the subject has a 26443- or 46873-associated disease or disorder or a pre-disposition to a 26443- or 46873-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[0449] The invention further provides in an electronic system and/or in a network, a method for determining whether a subject has a 26443- or 46873-associated disease or disorder or a pre-disposition to a disease associated with a 26443 or 46873 wherein the method comprises the steps of determining 26443 or 46873 sequence information associated with the subject, and based on the 26443 or 46873 sequence information, determining whether the subject has a 26443- or 46873-associated disease or disorder or a pre-disposition to a 26443- or 46873-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, comparing the 26443 or 46873 sequence of the subject to the 26443 or 46873 sequences in the database to thereby determine whether the subject has a 26443- or 46873-associated disease or disorder, or a pre-disposition for such.

[0450] The present invention also provides in a network, a method for determining whether a subject has a 26443- or 46873-associated disease or disorder or a pre-disposition to a 26443- or 46873-associated disease or disorder associated with 26443 or 46873, said method comprising the steps of receiving 26443 or 46873 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 26443 or 46873 and/or corresponding to a 26443- or 46873-associated disease or disorder (e.g., a disorder associated with aberrant or unwanted 26443 or 46873 expression or activity), and based on one or more of the phenotypic information, the 26443 or 46873 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has a 26443- or 46873-associated disease or disorder or a pre-disposition to a 26443- or 46873-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[0451] The present invention also provides a method for determining whether a subject has a 26443- or 46873-associated disease or disorder or a pre-disposition to a 26443- or 46873-associated disease or disorder, said method comprising the steps of receiving information related to 26443 or 46873 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 26443 or 46873 and/or related to a 26443- or 46873-associated disease or disorder, and based on one or more of the phenotypic information, the 26443 or 46873 information, and the acquired information, determining whether the subject has a 26443- or 46873-associated disease or disorder or a pre-disposition to a 26443- or 46873-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[0452] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

Background of the 61833 Invention

[0453] Amino acid decarboxylases are vitamin-B6-dependent enzymes (B6 enzymes). Most amino acid decarboxylases use pyridoxal-5′-phosphate (pyridoxal-P) as a coenzyme. In mammals, amino acid decarboxylases are responsible for the biosynthesis of biogenic amines and polyamines. In bacteria, constitutive amino acid decarboxylases appear to fulfill similar biosynthetic functions, whereas the decarboxylation of amino acids by inducible biodegenerative enzymes has been proposed to contribute to the regulation of pH both inside the cell and in its environment (Boeker and Snell, The Enzymes, 3rd edition, 6:217-253, 1972).

[0454] Decarboxylases can be subdivided into four groups that do not appear to be evolutionarily related. Group I includes glycine decarboxylase, which is part of a multienzyme complex. Group II contains glutamate, histidine, tyrosine and aromatic-L-amino acid decarboxylases. Group III includes prokaryotic ornithine, lysine, and arginine decarboxylases, while Group IV contains eukaryotic ornithine and arginine decarboxylase, and prokaryotic arginine decarboxylase and diaminopimelate decarboxylase.

[0455] The Group IV pyridoxal-dependent decarboxylases are particularly important due to their role in the synthesis of the polyamines putrescine, spermidine, and spermine. Within eukaryotic cells, the levels of polyamines have been positively correlated with cell cycle progression, and inhibition of polyamine synthesis causes cell growth arrest and cell death (Pegg (1986), Biochem J 234: 249-62; Tabor and Tabor (1984), Ann Rev Biochem 53: 749-90; Stefanelli et al. (2001), Biochem J 355:199-206). In addition, polyamines have been shown to decrease platelet aggregation and modulate the activity of voltage-gated sodium channels (de la Pena et al. (2000), Arch Med Res 31(6): 546-550; Huang and Moczydlowski (2001), Biophys J 80(3): 1262-79). Although putrescine, spermidine, and spermine appear to have slightly different activities, they are related by a common synthetic pathway. Putrescine can be converted into spermidine via the action of the aminopropyltransferase spermidine synthase, and a second aminopropyltransferase, spermine synthase, gives rise to spermine through the addition of another propylamine group to spermidine.

[0456] Eukaryotic ornithine decarboxylase plays a role in the biosynthesis of polyamines through its production of putrescine from the decarboxylation of ornithine. Consistent with the positive correlation between polyamine levels and cell growth, recent studies have shown that ornithine decarboxylase is upregulated in benign prostatic hyperplasia (BPH) samples taken from human prostates (Liu et al. (2000), Prostate 43(2): 83-7). Similarly, male Wistar rats that have been treated with 1,2-dimethylhydrazine (DMH), thereby leading to the formation of colonic premalignant aberrant crypt foci, and given an oral dose of ornithine produce higher levels of blood putrescine than control rats that have not been treated with DMH (Schleiffer et al. (2000), Cancer Detect Prev 24(6): 542-8). These studies indicate a correlation between increased ornithine decarboxylase expression and activity and the development of a hyperplastic cellular phenotype.

Summary of the 61833 Invention

[0457] The present invention is based, in part, on the discovery of a novel pyridoxal-dependent decarboxylase family member, referred to herein as “61833”. The nucleotide sequence of a cDNA encoding 61833 is shown in SEQ ID NO: 10, and the amino acid sequence of a 61833 polypeptide is shown in SEQ ID NO: 11 (see also Example 5). In addition, the nucleotide sequence of the coding region is depicted in SEQ ID NO: 12.

[0458] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes a 61833 protein or polypeptide, e.g., a biologically active portion of the 61833 protein. In a preferred embodiment the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO: 11. In other embodiments, the invention provides isolated 61833 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO: 10, SEQ ID NO: 12, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO:10, SEQ ID NO:12, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 10, SEQ ID NO: 12, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 61833 protein or an active fragment thereof.

[0459] In a related aspect, the invention further provides nucleic acid constructs that include a 61833 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included are vectors and host cells containing the 61833 nucleic acid molecules of the invention, e.g., vectors and host cells suitable for producing 61833 nucleic acid molecules and polypeptides.

[0460] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 61833-encoding nucleic acids.

[0461] In still another related aspect, isolated nucleic acid molecules that are antisense to a 61833 encoding nucleic acid molecule are provided.

[0462] In another aspect, the invention features 61833 polypeptides, and biologically active or antigenic fragments thereof that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 61833-mediated or -related disorders. In another embodiment, the invention provides 61833 polypeptides having a 61833 activity. Preferred polypeptides are 61833 proteins including at least one decarboxylase domain, e.g., a pyridoxal-dependent decarboxylase domain, e.g., an ornithine decarboxylase domain, and, preferably, having a 61833 activity, e.g., a 61833 activity as described herein.

[0463] In other embodiments, the invention provides 61833 polypeptides, e.g., a 61833 polypeptide having the amino acid sequence shown in SEQ ID NO: 11 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO: 11 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 10, SEQ ID NO: 12, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 61833 protein or an active fragment thereof.

[0464] In a related aspect, the invention provides 61833 polypeptides or fragments operatively linked to non-61833 polypeptides to form fusion proteins.

[0465] In another aspect, the invention features antibodies and antigen-binding fragments thereof, that react with, or more preferably specifically bind 61833 polypeptides.

[0466] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 61833 polypeptides or nucleic acids.

[0467] In still another aspect, the invention provides a process for modulating 61833 polypeptide or nucleic acid expression or activity, e.g. using the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 61833 polypeptides or nucleic acids, such as conditions involving aberrant or deficient cellular proliferation or differentiation. The screened compounds can modulate the decarboxylase activity of a 61833 polypeptide.

[0468] The invention also provides assays for determining the activity of or the presence or absence of 61833 polypeptides or nucleic acid molecules in a biological sample, including for disease diagnosis.

[0469] In one aspect, the invention provides a method of evaluating a sample. The method includes: providing a sample; detecting a 61833 polypeptide or nucleic acid in the sample; and, optionally, comparing the level of expressed 61833 molecules to a reference sample. In one embodiment, an increased level of 61833 molecules is an indication that the sample includes cells in mitosis. In another embodiment, the level of 61833 molecules is an indication that a sample includes a proliferating cell, e.g., a proliferating colon, liver, lung, breast, or ovary cell.

[0470] In yet another aspect, the invention provides methods for inhibiting the proliferation, or inducing the killing, of a 61833-expressing cell, e.g., a hyper-proliferative 61833-expressing cell. The method includes contacting the cell with a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 61833 polypeptide or nucleic acid. In a preferred embodiment, the contacting step is effected in vitro or ex vivo. In other embodiments, the contacting step is effected in vivo, e.g., in a subject (e.g., a mammal, e.g., a human), as part of a therapeutic or prophylactic protocol. In a preferred embodiment, the cell is a hyperproliferative cell, e.g., a cell found in a solid tumor, a soft tissue tumor, or a metastatic lesion.

[0471] In a preferred embodiment, the compound is an inhibitor of a 61833 polypeptide. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion). In another preferred embodiment, the compound is an inhibitor of a 61833 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[0472] In a preferred embodiment, the compound is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[0473] In another aspect, the invention features methods for treating or preventing a disorder characterized by aberrant cellular proliferation or differentiation of a 61833-expressing cell, in a subject. Preferably, the method includes administering to the subject (e.g., a mammal, e.g., a human) an effective amount of a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 61833 polypeptide or nucleic acid. In a preferred embodiment, the disorder is a cancerous or pre-cancerous condition.

[0474] In a further aspect, the invention provides methods for evaluating the efficacy of a treatment of a disorder, e.g., a proliferative disorder. The method includes: treating a subject, e.g., a patient or an animal, with a protocol under evaluation (e.g., treating a subject with one or more of: chemotherapy, radiation, and/or a compound identified using the methods described herein); and evaluating the expression of a 61833 nucleic acid or polypeptide before and after treatment. A change, e.g., a decrease or increase, in the level of a 61833 nucleic acid (e.g., mRNA) or polypeptide after treatment, relative to the level of expression before treatment, is indicative of the efficacy of the treatment of the disorder. The level of 61833 nucleic acid or polypeptide expression can be detected by any method described herein.

[0475] In a preferred embodiment, the evaluating step includes obtaining a sample (e.g., a tissue sample, e.g., a biopsy, or a fluid sample) from the subject before and after treatment, and comparing the level of expression of a 61833 nucleic acid (e.g., mRNA) or polypeptide before and after treatment.

[0476] In another aspect, the invention provides methods for evaluating the efficacy of a therapeutic or prophylactic agent (e.g., an anti-neoplastic agent). The method includes: contacting a sample with an agent (e.g., a compound identified using the methods described herein, a cytotoxic agent); and evaluating the expression of 61833 nucleic acid or polypeptide in the sample before and after the contacting step. A change, e.g., a decrease or increase, in the level of 61833 nucleic acid (e.g., mRNA) or polypeptide in the sample obtained after the contacting step, relative to the level of expression in the sample before the contacting step, is indicative of the efficacy of the agent. The level of 61833 nucleic acid or polypeptide expression can be detected by any method described herein. In a preferred embodiment, the sample includes cells obtained from a cancerous tissue.

[0477] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 61833 polypeptide or nucleic acid molecule, including for disease diagnosis.

[0478] In another aspect, the invention features a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes a 61833 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to a 61833 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 61833 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.

[0479] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

Detailed Description of 61833

[0480] The human 61833 sequence (see SEQ ID NO: 10, as recited in Example 5), which is approximately 1937 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 1383 nucleotides, including the termination codon. The coding sequence encodes a 460 amino acid protein (see SEQ ID NO: 11, as recited in Example 5).
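By way of illustration only, the lengths recited above can be cross-checked with simple arithmetic: 460 codons plus one termination codon account for the stated coding sequence of about 1383 nucleotides, and the remainder of the approximately 1937 nucleotides is untranslated. The constant and function names in the following sketch are hypothetical and are not part of the sequence listing.

```python
# Figures quoted in the text for the human 61833 sequence.
TOTAL_NUCLEOTIDES = 1937   # approximate full length, including untranslated regions
PROTEIN_LENGTH = 460       # amino acids encoded (SEQ ID NO: 11)
STOP_CODON_NT = 3          # the termination codon is counted in the coding sequence


def coding_length(protein_length: int, include_stop: bool = True) -> int:
    """Return the number of nucleotides needed to encode a protein (3 per codon)."""
    return protein_length * 3 + (STOP_CODON_NT if include_stop else 0)


cds_nt = coding_length(PROTEIN_LENGTH)   # coding sequence, stop codon included
utr_nt = TOTAL_NUCLEOTIDES - cds_nt      # combined 5' and 3' untranslated regions

print(cds_nt)  # 1383
print(utr_nt)  # 554
```

The untranslated total is only as precise as the approximate 1937-nucleotide figure, and the text does not state how it is divided between the 5′ and 3′ ends.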

[0481] Human 61833 contains the following regions or other structural features:

[0482] one pyridoxyl-dependent decarboxylase domain (PFAM Accession Number PF00278) located at about amino acid residues 41 to 401 of SEQ ID NO: 11;

[0483] one decarboxylase family 2 pyridoxal phosphate attachment site (PS00878) located at about amino acid residues 67 to 85 of SEQ ID NO:11;

[0484] one decarboxylase family 2 signature motif (PS00879) located at about amino acid residues 230 to 241 of SEQ ID NO: 11;

[0485] six predicted Protein Kinase C phosphorylation sites (PS00005) located at about amino acid residues 18 to 20, 149 to 151, 168 to 170, 174 to 176, 177 to 179, and 311 to 313 of SEQ ID NO: 11;

[0486] five predicted Casein Kinase II phosphorylation sites (PS00006) located at about amino acids 6 to 9, 18 to 21, 34 to 37, 212 to 215, and 258 to 261 of SEQ ID NO:11;

[0487] one predicted cAMP/cGMP-dependent protein kinase phosphorylation site (PS00004) located at about amino acids 343 to 346 of SEQ ID NO: 11;

[0488] one predicted tyrosine kinase phosphorylation site (PS00007) located at about amino acid residues 344 to 352 of SEQ ID NO: 11;

[0489] and seven predicted N-myristylation sites (PS00008) from about amino acids 29 to 34, 85 to 90, 103 to 108, 223 to 228, 237 to 242, 325 to 330, and 400 to 405 of SEQ ID NO:11.
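For illustration, the structural features listed above can be held in a small table and checked for internal consistency, e.g., that every site lies within the 460-residue protein and that sites of one type share one length. The following is a minimal sketch; the dictionary keys and the function name are hypothetical, and the coordinates are simply those recited in the list.

```python
PROTEIN_LENGTH = 460  # residues in the 61833 protein, as stated in the text

# 1-based, inclusive residue ranges copied from the feature list above.
FEATURES = {
    "PF00278 decarboxylase domain": [(41, 401)],
    "PS00878 attachment site": [(67, 85)],
    "PS00879 signature motif": [(230, 241)],
    "PS00005 protein kinase C": [(18, 20), (149, 151), (168, 170),
                                 (174, 176), (177, 179), (311, 313)],
    "PS00006 casein kinase II": [(6, 9), (18, 21), (34, 37),
                                 (212, 215), (258, 261)],
    "PS00004 cAMP/cGMP-dependent kinase": [(343, 346)],
    "PS00007 tyrosine kinase": [(344, 352)],
    "PS00008 N-myristylation": [(29, 34), (85, 90), (103, 108), (223, 228),
                                (237, 242), (325, 330), (400, 405)],
}


def summarize(features, protein_length=PROTEIN_LENGTH):
    """Map each feature name to (number of sites, sorted distinct site lengths)."""
    summary = {}
    for name, spans in features.items():
        for start, end in spans:
            # Reject any range that falls outside the protein or is reversed.
            if not 1 <= start <= end <= protein_length:
                raise ValueError(
                    f"{name}: span {start}-{end} lies outside 1-{protein_length}"
                )
        lengths = sorted({end - start + 1 for start, end in spans})
        summary[name] = (len(spans), lengths)
    return summary


for name, (count, lengths) in summarize(FEATURES).items():
    print(f"{name}: {count} site(s), length(s) {lengths}")
```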

[0490] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[0491] A plasmid containing the nucleotide sequence encoding human 61833 (clone “Fbh61833FL”) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

[0492] The 61833 protein contains a significant number of structural characteristics in common with members of the pyridoxyl-dependent decarboxylase family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics.

[0493] A pyridoxyl-dependent decarboxylase family of proteins can be subdivided into four subgroups: Group I includes glycine decarboxylases; group II includes glutamate, histidine, tyrosine, and aromatic-L-amino acid decarboxylases; group III includes prokaryotic ornithine and lysine decarboxylases as well as the prokaryotic biodegradative type of decarboxylases; and group IV includes eukaryotic ornithine and arginine decarboxylases as well as the prokaryotic biosynthetic type of arginine decarboxylases. These four families have been described in, e.g., Sandmeier et al. (1994), Eur J Biochem 221(3): 997-1002, the contents of which are incorporated herein by reference. 61833 polypeptides have structural characteristics in common with the group IV pyridoxyl-dependent decarboxylases.

[0494] A 61833 polypeptide can include a “pyridoxyl-dependent decarboxylase domain” or regions homologous with a “pyridoxyl-dependent decarboxylase domain”.

[0495] As used herein, the term “pyridoxyl-dependent decarboxylase domain” includes an amino acid sequence of about 250 to 500 amino acid residues in length and having a bit score for the alignment of the sequence to the pyridoxyl-dependent decarboxylase domain profile (Pfam HMM) of at least 215. Preferably, a pyridoxyl-dependent decarboxylase domain includes at least about 300 to 450 amino acids, more preferably about 325 to 425 amino acid residues, or about 350 to 400 amino acids and has a bit score for the alignment of the sequence to the pyridoxyl-dependent decarboxylase domain (HMM) of at least 250, 300, 350, 375, 400, 425 or greater. The pyridoxyl-dependent decarboxylase domain (HMM) has been assigned the PFAM Accession Number PF00278 (http://genome.wustl.edu/Pfam/.html). An alignment of the pyridoxyl-dependent decarboxylase domain (amino acids 41 to 401 of SEQ ID NO:11) of human 61833 with a consensus amino acid sequence (SEQ ID NO:13) derived from a hidden Markov model is depicted in FIG. 10.

[0496] In a preferred embodiment, a 61833 polypeptide or protein has a “pyridoxyl-dependent decarboxylase domain” or a region which includes at least about 250 to 500, more preferably about 325 to 425, or 350 to 400 amino acid residues and has at least about 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “pyridoxyl-dependent decarboxylase domain,” e.g., the pyridoxyl-dependent decarboxylase domain of human 61833 (e.g., residues 41 to 401 of SEQ ID NO:11).

[0497] To identify the presence of a “pyridoxyl-dependent decarboxylase” domain in a 61833 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against the Pfam database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference. A search was performed against the HMM database resulting in the identification of a “pyridoxyl-dependent decarboxylase” domain in the amino acid sequence of human 61833 at about residues 41 to 401 of SEQ ID NO:11 (see FIG. 10).
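By way of a non-limiting sketch, the two numeric tests described above (the search threshold of 15 bits, optionally lowered to 8 bits, and the definitional criteria of about 250 to 500 residues with a bit score of at least 215) can be expressed as simple predicates. The class and function names are hypothetical, the code does not call HMMER or Pfam, and the bit score in the usage lines is a placeholder because the score obtained for human 61833 is not stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainHit:
    """One profile-HMM match, using 1-based inclusive residue coordinates."""
    start: int
    end: int
    bit_score: float

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def is_search_hit(hit: DomainHit, threshold: float = 15.0) -> bool:
    """Apply the hit threshold; the text gives 15 as the default and 8 as a relaxed value."""
    return hit.bit_score >= threshold


def meets_domain_definition(hit: DomainHit,
                            min_length: int = 250,
                            max_length: int = 500,
                            min_bit_score: float = 215.0) -> bool:
    """Apply the length and bit-score criteria of the domain definition."""
    return min_length <= hit.length <= max_length and hit.bit_score >= min_bit_score


# Residues 41 to 401 come from the text; the score of 300.0 is a made-up placeholder.
example = DomainHit(start=41, end=401, bit_score=300.0)
print(example.length, is_search_hit(example), meets_domain_definition(example))
```

Keeping the two predicates separate reflects the text, in which the search threshold decides whether a match is reported at all, while the stricter criteria decide whether the match falls within the definition of the domain.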

[0498] In one embodiment, a 61833 protein includes at least one decarboxylase family 2 pyridoxyl phosphate binding site, located at about amino acid residues 67 to 85 of SEQ ID NO: 11. As used herein, the term “decarboxylase family 2 pyridoxyl phosphate binding site” includes a sequence of at least 12 amino acid residues defined by the sequence: (F/Y)-(P/A)-X-K-(S/A/C/V)-(N/H/C/L/F/W)-X-X-X-X-(L/I/V/M/F)-(L/I/V/M/T/A)-X-X-(L/I/V/M/A)-X-X-X-(G/T/E) (SEQ ID NO: 14). A decarboxylase family 2 pyridoxyl phosphate binding site, as defined, can be involved in the binding of pyridoxyl-5′-phosphate, as well as the removal of a carboxyl group from an appropriate substrate, e.g., ornithine or an amino acid, e.g., arginine. More preferably, a decarboxylase family 2 pyridoxyl phosphate binding site includes 15, or even more preferably 19, amino acid residues which encompass the lysine (K) residue of position 4. Decarboxylase family 2 pyridoxyl phosphate binding sites have been described in, e.g., the PROSITE database (www.expasy.org; PS00878), the contents of which are hereby incorporated by reference.

[0499] In one embodiment, a 61833 protein includes at least one decarboxylase family 2 signature sequence, located at about amino acid residues 227 to 241 of SEQ ID NO: 11. As used herein, the term “decarboxylase family 2 signature sequence” includes a sequence of at least 8 amino acid residues defined by the sequence: (G/S)-X-X-(L/I/V/M/S/C/P)-X-X-(L/I/V/M/F)-(D/N/S)-(L/I/V/M/C/A)-G-G-G-(L/I/V/M/F/Y)-(G/S/T/P/C/E/Q) (SEQ ID NO: 15). A decarboxylase family 2 signature sequence, as defined, can be involved in the binding of a substrate molecule, e.g., ornithine or an amino acid, e.g., arginine, or a pyridoxyl-dependent decarboxylase inhibitor, e.g., α-difluoromethylornithine, as well as the removal of a carboxyl group from an appropriate substrate, e.g., ornithine or an amino acid, e.g., arginine. More preferably, a decarboxylase family 2 signature motif includes 11, or even more preferably 14, amino acid residues which encompass the glycine (G) residues of positions 10, 11, and 12. Decarboxylase family 2 signature motifs have been described in, e.g., the PROSITE database (www.expasy.org; PS00879), the contents of which are hereby incorporated by reference.
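As an illustrative sketch only, the two consensus sequences recited above (SEQ ID NO: 14 and SEQ ID NO: 15) can be translated into regular expressions and used to scan an amino acid sequence. The function names are hypothetical, “X” is taken to mean any single residue, and the test peptides in the usage lines are invented strings that satisfy the patterns; they are not taken from SEQ ID NO: 11.

```python
import re

# Consensus sequences copied from the text.
BINDING_SITE = ("(F/Y)-(P/A)-X-K-(S/A/C/V)-(N/H/C/L/F/W)-X-X-X-X-(L/I/V/M/F)-"
                "(L/I/V/M/T/A)-X-X-(L/I/V/M/A)-X-X-X-(G/T/E)")
SIGNATURE = ("(G/S)-X-X-(L/I/V/M/S/C/P)-X-X-(L/I/V/M/F)-(D/N/S)-(L/I/V/M/C/A)-"
             "G-G-G-(L/I/V/M/F/Y)-(G/S/T/P/C/E/Q)")


def pattern_to_regex(pattern: str) -> str:
    """Convert a hyphen-separated consensus sequence into a regular expression."""
    parts = []
    for token in pattern.split("-"):
        if token == "X":
            parts.append("[A-Z]")                                  # any residue
        elif token.startswith("(") and token.endswith(")"):
            parts.append("[" + token[1:-1].replace("/", "") + "]")  # allowed set
        else:
            parts.append(token)                                    # fixed residue
    return "".join(parts)


def find_sites(pattern: str, sequence: str):
    """Return 1-based inclusive (start, end) pairs for non-overlapping matches."""
    regex = re.compile(pattern_to_regex(pattern))
    return [(m.start() + 1, m.end()) for m in regex.finditer(sequence)]


# Invented peptides that satisfy each pattern, used only to exercise the code.
print(len(BINDING_SITE.split("-")), len(SIGNATURE.split("-")))
print(find_sites(BINDING_SITE, "MM" + "FPAKSNAAAALLAALAAAG"))
print(find_sites(SIGNATURE, "GAALAALDLGGGLG"))
```

The full binding-site pattern spans 19 positions with the lysine at position 4, and the full signature pattern spans 14 positions with glycines at positions 10 to 12, in agreement with the preferred lengths stated above.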

[0500] A 61833 family member can include at least one pyridoxyl-dependent decarboxylase domain. Furthermore, a 61833 family member can include: at least one decarboxylase family 2 pyridoxyl phosphate binding site that has the capacity to bind pyridoxyl 5′ phosphate; at least one decarboxylase family 2 signature motif that has the capacity to bind to a substrate molecule; at least one, two, three, four, five, preferably six predicted protein kinase C phosphorylation sites (PS00005); at least one, two, three, four, and preferably five predicted casein kinase II phosphorylation sites (PS00006); at least one predicted cAMP/cGMP-dependent protein kinase phosphorylation site (PS00004); at least one predicted tyrosine kinase phosphorylation site (PS00007); and at least one, two, three, four, five, six, and preferably seven predicted N-myristylation sites (PS00008).

[0501] As the 61833 polypeptides of the invention may modulate 61833-mediated activities, they may be useful for developing novel diagnostic and therapeutic agents for 61833-mediated or related disorders, as described below.

[0502] As used herein, a “61833 activity”, “biological activity of 61833” or “functional activity of 61833”, refers to an activity exerted by a 61833 protein, polypeptide or nucleic acid molecule. For example, a 61833 activity can be an activity exerted by 61833 in a physiological milieu on, e.g., a 61833-responsive cell or on a 61833 substrate, e.g., a protein substrate. A 61833 activity can be determined in vivo or in vitro. In one embodiment, a 61833 activity is a direct activity, such as an association with a 61833 target molecule. A “target molecule” or “binding partner” is a molecule with which a 61833 protein binds or interacts in nature.

[0503] In an exemplary embodiment, 61833 is an enzyme that removes carboxyl groups from appropriate substrate molecules, e.g., ornithine or amino acids, e.g., arginine. As used herein, a “pyridoxyl-dependent decarboxylase activity” refers to an activity that catalyzes the removal of a carboxyl group from an appropriate substrate.

[0504] A 61833 activity can also be an indirect activity, e.g., a cellular signaling activity mediated by interaction of the 61833 protein with a 61833 receptor. The features of the 61833 molecules of the present invention can provide similar biological activities as pyridoxyl-dependent decarboxylase family members. For example, the 61833 proteins of the present invention can have one or more of the following activities: (1) pyridoxyl-dependent decarboxylase activity, e.g., for the removal of a carboxyl group from ornithine or an amino acid, e.g., arginine; (2) stimulation of polyamine levels within a cell, e.g., polyamines such as putrescine, spermidine, and spermine; (3) stimulation of cell growth and proliferation; (4) inhibition of cell death; (5) inhibition of platelet aggregation; (6) inhibition of atherosclerotic plaque formation; (7) modulation of voltage-gated sodium channels; or (8) modulation of neuronal behavior. Thus, the 61833 molecules can act as novel diagnostic targets and therapeutic agents for controlling (1) cellular proliferative or differentiative disorders, (2) cardiovascular disorders, and/or (3) disorders of the brain.

[0505] Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast and liver origin.

[0506] As used herein, the terms “cancer”, “hyperproliferative” and “neoplastic” refer to cells having the capacity for autonomous growth, i.e., an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative and neoplastic disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. “Pathologic hyperproliferative” cells occur in disease states characterized by malignant tumor growth. Examples of non-pathologic hyperproliferative cells include proliferation of cells associated with wound repair.

[0507] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[0508] The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

[0509] The term “sarcoma” is art recognized and refers to malignant tumors of mesenchymal derivation.

[0510] Examples of cellular proliferative and/or differentiative disorders of the colon include, but are not limited to, non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors.

[0511] Examples of cellular proliferative and/or differentiative disorders of the liver include, but are not limited to, nodular hyperplasias, adenomas, and malignant tumors, including primary carcinoma of the liver and metastatic tumors.

[0512] Examples of cellular proliferative and/or differentiative disorders of the breast include, but are not limited to, proliferative breast disease including, e.g., epithelial hyperplasia, sclerosing adenosis, and small duct papillomas; tumors, e.g., stromal tumors such as fibroadenoma, phyllodes tumor, and sarcomas, and epithelial tumors such as large duct papilloma; carcinoma of the breast including in situ (noninvasive) carcinoma that includes ductal carcinoma in situ (including Paget's disease) and lobular carcinoma in situ, and invasive (infiltrating) carcinoma including, but not limited to, invasive ductal carcinoma, invasive lobular carcinoma, medullary carcinoma, colloid (mucinous) carcinoma, tubular carcinoma, and invasive papillary carcinoma, and miscellaneous malignant neoplasms. Disorders in the male breast include, but are not limited to, gynecomastia and carcinoma.

[0513] Examples of cellular proliferative and/or differentiative disorders of the lung include, but are not limited to, bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

[0514] Additional examples of proliferative disorders include hematopoietic neoplastic disorders. As used herein, the term “hematopoietic neoplastic disorders” includes diseases involving hyperplastic/neoplastic cells of hematopoietic origin, e.g., arising from myeloid, lymphoid or erythroid lineages, or precursor cells thereof. Preferably, the diseases arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit Rev. in Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to, acute lymphoblastic leukemia (ALL) which includes B-lineage ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell leukemia (HLL) and Waldenstrom's macroglobulinemia (WM). Additional forms of malignant lymphomas include, but are not limited to, non-Hodgkin lymphoma and variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGF), Hodgkin's disease and Reed-Sternberg disease.

[0515] Further examples of cancers or neoplastic conditions, in addition to the ones described above, include, but are not limited to, a fibrosarcoma, myosarcoma, liposarcoma, chondrosarcoma, osteogenic sarcoma, chordoma, angiosarcoma, endotheliosarcoma, lymphangiosarcoma, lymphangioendotheliosarcoma, synovioma, mesothelioma, Ewing's tumor, leiomyosarcoma, rhabdomyosarcoma, gastric cancer, esophageal cancer, rectal cancer, pancreatic cancer, ovarian cancer, prostate cancer, uterine cancer, cancer of the head and neck, skin cancer, brain cancer, squamous cell carcinoma, sebaceous gland carcinoma, papillary carcinoma, papillary adenocarcinoma, cystadenocarcinoma, medullary carcinoma, bronchogenic carcinoma, renal cell carcinoma, hepatoma, bile duct carcinoma, choriocarcinoma, seminoma, embryonal carcinoma, Wilms' tumor, cervical cancer, testicular cancer, small cell lung carcinoma, non-small cell lung carcinoma, bladder carcinoma, epithelial carcinoma, glioma, astrocytoma, medulloblastoma, craniopharyngioma, ependymoma, pinealoma, hemangioblastoma, acoustic neuroma, oligodendroglioma, meningioma, melanoma, neuroblastoma, retinoblastoma, leukemia, lymphoma, or Kaposi sarcoma. Many such neoplastic conditions can progress to a metastatic state, e.g., resulting in tumor cells moving to new locations and forming metastatic tumors. The motility of such cells can depend on extracellular ligands, e.g., a ligand that is synthesized by a 61833 polypeptide.

[0516] 61833 molecules can also be useful as novel diagnostic targets and therapeutic agents for cardiovascular diseases. Examples of disorders involving the heart or “cardiovascular disorder” include, but are not limited to, a disease, disorder, or state involving the cardiovascular system, e.g., the heart, the blood vessels, and/or the blood. A cardiovascular disorder can be caused by an imbalance in arterial pressure, a malfunction of the heart, or an occlusion of a blood vessel, e.g., by a thrombus. Examples of such disorders include hypertension, atherosclerosis, coronary artery spasm, congestive heart failure, coronary artery disease, valvular disease, arrhythmias, and cardiomyopathies.

[0517] Disorders involving blood vessels include, but are not limited to, responses of vascular cell walls to injury, such as endothelial dysfunction and endothelial activation and intimal thickening; vascular diseases including, but not limited to, congenital anomalies, such as arteriovenous fistula, atherosclerosis, and hypertensive vascular disease, such as hypertension; inflammatory disease—the vasculitides, such as giant cell (temporal) arteritis, Takayasu arteritis, polyarteritis nodosa (classic), Kawasaki syndrome (mucocutaneous lymph node syndrome), microscopic polyangiitis (microscopic polyarteritis, hypersensitivity or leukocytoclastic angiitis), Wegener granulomatosis, thromboangiitis obliterans (Buerger disease), vasculitis associated with other disorders, and infectious arteritis; Raynaud disease; aneurysms and dissection, such as abdominal aortic aneurysms, syphilitic (luetic) aneurysms, and aortic dissection (dissecting hematoma); disorders of veins and lymphatics, such as varicose veins, thrombophlebitis and phlebothrombosis, obstruction of superior vena cava (superior vena cava syndrome), obstruction of inferior vena cava (inferior vena cava syndrome), and lymphangitis and lymphedema; tumors, including benign tumors and tumor-like conditions, such as hemangioma, lymphangioma, glomus tumor (glomangioma), vascular ectasias, and bacillary angiomatosis, and intermediate-grade (borderline low-grade malignant) tumors, such as Kaposi sarcoma and hemangioendothelioma, and malignant tumors, such as angiosarcoma and hemangiopericytoma; and pathology of therapeutic interventions in vascular disease, such as balloon angioplasty and related techniques and vascular replacement, such as coronary artery bypass graft surgery.

[0518] Disorders involving the brain include, but are not limited to, disorders involving neurons, and disorders involving glia, such as astrocytes, oligodendrocytes, ependymal cells, and microglia; cerebral edema, raised intracranial pressure and herniation, and hydrocephalus; malformations and developmental diseases, such as neural tube defects, forebrain anomalies, posterior fossa anomalies, and syringomyelia and hydromyelia; perinatal brain injury; cerebrovascular diseases, such as those related to hypoxia, ischemia, and infarction, including hypotension, hypoperfusion, and low-flow states—global cerebral ischemia and focal cerebral ischemia—infarction from obstruction of local blood supply, intracranial hemorrhage, including intracerebral (intraparenchymal) hemorrhage, subarachnoid hemorrhage and ruptured berry aneurysms, and vascular malformations, hypertensive cerebrovascular disease, including lacunar infarcts, slit hemorrhages, and hypertensive encephalopathy; infections, such as acute meningitis, including acute pyogenic (bacterial) meningitis and acute aseptic (viral) meningitis, acute focal suppurative infections, including brain abscess, subdural empyema, and extradural abscess, chronic bacterial meningoencephalitis, including tuberculosis and mycobacterioses, neurosyphilis, and neuroborreliosis (Lyme disease), viral meningoencephalitis, including arthropod-borne (Arbo) viral encephalitis, Herpes simplex virus Type 1, Herpes simplex virus Type 2, Varicella-zoster virus (Herpes zoster), cytomegalovirus, poliomyelitis, rabies, and human immunodeficiency virus 1, including HIV-1 meningoencephalitis (subacute encephalitis), vacuolar myelopathy, AIDS-associated myopathy, peripheral neuropathy, and AIDS in children, progressive multifocal leukoencephalopathy, subacute sclerosing panencephalitis, fungal meningoencephalitis, other infectious diseases of the nervous system; transmissible spongiform encephalopathies (prion diseases); demyelinating diseases,
including multiple sclerosis, multiple sclerosis variants, acute disseminated encephalomyelitis and acute necrotizing hemorrhagic encephalomyelitis, and other diseases with demyelination; degenerative diseases, such as degenerative diseases affecting the cerebral cortex, including Alzheimer disease and Pick disease, degenerative diseases of basal ganglia and brain stem, including Parkinsonism, idiopathic Parkinson disease (paralysis agitans), progressive supranuclear palsy, corticobasal degeneration, multiple system atrophy, including striatonigral degeneration, Shy-Drager syndrome, and olivopontocerebellar atrophy, and Huntington disease; spinocerebellar degenerations, including spinocerebellar ataxias, including Friedreich ataxia, and ataxia-telangiectasia, degenerative diseases affecting motor neurons, including amyotrophic lateral sclerosis (motor neuron disease), bulbospinal atrophy (Kennedy syndrome), and spinal muscular atrophy; inborn errors of metabolism, such as leukodystrophies, including Krabbe disease, metachromatic leukodystrophy, adrenoleukodystrophy, Pelizaeus-Merzbacher disease, and Canavan disease, mitochondrial encephalomyopathies, including Leigh disease and other mitochondrial encephalomyopathies; toxic and acquired metabolic diseases, including vitamin deficiencies such as thiamine (vitamin B1) deficiency and vitamin B12 deficiency, neurologic sequelae of metabolic disturbances, including hypoglycemia, hyperglycemia, and hepatic encephalopathy, toxic disorders, including carbon monoxide, methanol, ethanol, and radiation, including combined methotrexate and radiation-induced injury; tumors, such as gliomas, including astrocytoma, including fibrillary (diffuse) astrocytoma and glioblastoma multiforme, pilocytic astrocytoma, pleomorphic xanthoastrocytoma, and brain stem glioma, oligodendroglioma, and ependymoma and related paraventricular mass lesions, neuronal tumors, poorly differentiated neoplasms, including medulloblastoma, other parenchymal
tumors, including primary brain lymphoma, germ cell tumors, and pineal parenchymal tumors, meningiomas, metastatic tumors, paraneoplastic syndromes, peripheral nerve sheath tumors, including schwannoma, neurofibroma, and malignant peripheral nerve sheath tumor (malignant schwannoma), and neurocutaneous syndromes (phakomatoses), including neurofibromatosis, including Type 1 neurofibromatosis (NF1) and Type 2 neurofibromatosis (NF2), tuberous sclerosis, and Von Hippel-Lindau disease.

[0519] The 61833 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of other conditions, in addition to the ones described above (see “Methods of Treatment” for additional examples).

[0520] The 61833 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO: 11 are collectively referred to as “polypeptides or proteins of the invention” or “61833 polypeptides or proteins”. Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “61833 nucleic acids.” 61833 molecules refer to 61833 nucleic acids, polypeptides, and antibodies.

[0521] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA), RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA. A DNA or RNA analog can be synthesized from nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[0522] The term “isolated nucleic acid molecule” or “purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules present in the natural source of the nucleic acid. For example, with regard to genomic DNA, the term “isolated” includes nucleic acid molecules which are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[0523] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and nonaqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified.
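The four sets of conditions listed above can be held in a small lookup table. The following is a minimal sketch only; the names `Stringency`, `CONDITIONS`, `DEFAULT_LEVEL`, and `conditions_for` are hypothetical and are not part of the disclosure. The low stringency wash temperature is recorded as its stated minimum of 50° C.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stringency:
    """One hybridization/wash regime (field names are illustrative)."""
    hybridization: str
    hyb_temp_c: int
    wash: str
    wash_temp_c: int


# Values transcribed from the conditions recited in the text.
CONDITIONS = {
    "low": Stringency("6× SSC", 45, "0.2× SSC, 0.1% SDS", 50),
    "medium": Stringency("6× SSC", 45, "0.2× SSC, 0.1% SDS", 60),
    "high": Stringency("6× SSC", 45, "0.2× SSC, 0.1% SDS", 65),
    "very high": Stringency("0.5 M sodium phosphate, 7% SDS", 65,
                            "0.2× SSC, 1% SDS", 65),
}

# Very high stringency applies unless otherwise specified.
DEFAULT_LEVEL = "very high"


def conditions_for(level=None):
    """Return the regime for a named level, or the default regime."""
    return CONDITIONS[level or DEFAULT_LEVEL]
```

Calling `conditions_for()` with no argument returns the very high stringency regime, mirroring the statement that those conditions apply unless otherwise specified.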

[0524] Preferably, an isolated nucleic acid molecule of the invention that hybridizes under a stringency condition described herein to the sequence of SEQ ID NO: 10 or SEQ ID NO: 12, corresponds to a naturally-occurring nucleic acid molecule.

[0525] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature. For example, a naturally-occurring nucleic acid molecule can encode a natural protein.

[0526] As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules which include at least an open reading frame encoding a 61833 protein. The gene can optionally further include non-coding sequences, e.g., regulatory sequences and introns. Preferably, a gene encodes a mammalian 61833 protein or derivative thereof.

[0527] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. “Substantially free” means that a preparation of 61833 protein is at least 10% pure. In a preferred embodiment, the preparation of 61833 protein has less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-61833 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-61833 chemicals. When the 61833 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[0528] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 61833 without abolishing or substantially altering a 61833 activity. Preferably the alteration does not substantially alter the 61833 activity, e.g., the activity is at least 20%, 40%, 60%, 70% or 80% of wild-type. An “essential” amino acid residue is a residue that, when altered from the wild-type sequence of 61833, results in abolishing a 61833 activity such that less than 20% of the wild-type activity is present. For example, conserved amino acid residues in 61833 are predicted to be particularly unamenable to alteration.

[0529] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a 61833 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a 61833 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 61833 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO:10 or SEQ ID NO:12, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
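The side chain families listed above can be expressed as sets of one-letter amino acid codes, which gives a simple test for whether a substitution is conservative in the sense defined here. This is a minimal sketch; the names `SIDE_CHAIN_FAMILIES`, `shared_families`, and `is_conservative` are hypothetical, and the sketch treats membership in any one shared family as sufficient, which is an assumption about how overlapping families (e.g., threonine, which is both uncharged polar and beta-branched) should be handled.

```python
# Families as listed in the text, written with one-letter residue codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(a, b):
    """Return the names of all families that contain both residues."""
    a, b = a.upper(), b.upper()
    return sorted(
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if a in members and b in members
    )


def is_conservative(a, b):
    """True if a -> b replaces a residue with a different residue
    from at least one shared side chain family."""
    return a.upper() != b.upper() and bool(shared_families(a, b))
```

For example, `is_conservative("K", "R")` is true (both basic), while `is_conservative("D", "K")` is false (acidic versus basic).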

[0530] As used herein, a “biologically active portion” of a 61833 protein includes a fragment of a 61833 protein which participates in an interaction, e.g., an intramolecular or an inter-molecular interaction. An inter-molecular interaction can be a specific binding interaction or an enzymatic interaction (e.g., the interaction can be transient and a covalent bond is formed or broken). An inter-molecular interaction can be between a 61833 molecule and a non-61833 molecule or between a first 61833 molecule and a second 61833 molecule (e.g., a dimerization interaction). Biologically active portions of a 61833 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 61833 protein, e.g., the amino acid sequence shown in SEQ ID NO:11, which include fewer amino acids than the full-length 61833 proteins, and exhibit at least one activity of a 61833 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 61833 protein, e.g., pyridoxyl-dependent decarboxylase activity. A biologically active portion of a 61833 protein can be a polypeptide which is, for example, 10, 25, 50, 100, 200 or more amino acids in length. Biologically active portions of a 61833 protein can be used as targets for developing agents which modulate a 61833 mediated activity, e.g., pyridoxyl-dependent decarboxylase activity.

[0531] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[0532] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, 60%, and even more preferably at least 70%, 80%, 90%, 100% of the length of the reference sequence. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”).

[0533] The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
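As an illustration of the calculation described above, the following sketch computes percent identity from two sequences that have already been aligned, with gaps written as “-”. It assumes one common convention, namely identical positions divided by the number of alignment columns (gap columns included); the text does not fix a specific formula, so the denominator is an assumption, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over all alignment columns, gaps included.

    Both inputs must be the two rows of one alignment and therefore
    have equal length. Columns in which both rows hold a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a, aligned_b)
        if not (x == gap and y == gap)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)
```

With this convention, each gap position lowers the result in the same way as a mismatch, so both the number of gaps and the length of each gap are taken into account.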

[0534] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch ((1970) J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used unless otherwise specified) is a BLOSUM 62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
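For orientation, the global alignment idea behind the Needleman and Wunsch algorithm can be sketched in a few lines. This is not the GAP program and does not reproduce its parameters: it uses a simple match/mismatch score and a single linear gap penalty instead of a substitution matrix with separate gap and length weights. The function name and the default scores are assumptions chosen for illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment with a linear gap penalty (illustrative scores).

    Returns (aligned_a, aligned_b, score), with "-" marking gaps.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]
```

The two aligned strings returned by this sketch have equal length and can be passed to any column-wise identity calculation.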

[0535] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Meyers and W. Miller ((1989) CABIOS, 4:11-17) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[0536] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 61833 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 61833 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25:3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[0537] Particularly preferred 61833 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO: 11. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO: 11 are termed substantially identical.

[0538] In the context of nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:10 or 12 are termed substantially identical.

[0539] “Misexpression or aberrant expression”, as used herein, refers to a non-wildtype pattern of gene expression at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over- or under-expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of altered, e.g., increased or decreased, expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, translated amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[0540] “Subject,” as used herein, refers to human and non-human animals. The term “non-human animals” of the invention includes all vertebrates, e.g., mammals, such as non-human primates (particularly higher primates), sheep, dog, rodent (e.g., mouse or rat), guinea pig, goat, pig, cat, rabbits, cow, and non-mammals, such as chickens, amphibians, reptiles, etc. In a preferred embodiment, the subject is a human. In another embodiment, the subject is an experimental animal or animal suitable as a disease model.

[0541] A “purified preparation of cells”, as used herein, refers to an in vitro preparation of cells. In the case of cells from multicellular organisms (e.g., plants and animals), a purified preparation of cells is a subset of cells obtained from the organism, not the entire intact organism. In the case of unicellular microorganisms (e.g., cultured cells and microbial cells), it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[0542] Various aspects of the invention are described in further detail below.

[0543] Isolated Nucleic Acid Molecules of 61833

[0544] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes a 61833 polypeptide described herein, e.g., a full-length 61833 protein or a fragment thereof, e.g., a biologically active portion of 61833 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 61833 mRNA, and fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[0545] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO: 10, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 61833 protein (i.e., “the coding region” of SEQ ID NO:10, as shown in SEQ ID NO: 12), as well as 5′ untranslated sequences. Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:10 (e.g., SEQ ID NO: 12) and, e.g., no flanking sequences which normally accompany the subject sequence. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to a fragment of the protein, from about amino acid 41 to 401 of SEQ ID NO:11.

[0546] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule which is a complement of the nucleotide sequence shown in SEQ ID NO: 10 or SEQ ID NO: 12, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO: 10 or SEQ ID NO: 12, such that it can hybridize (e.g., under a stringency condition described herein) to the nucleotide sequence shown in SEQ ID NO:10 or 12, thereby forming a stable duplex.

[0547] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence which is at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO: 10 or SEQ ID NO: 12, or a portion, preferably of the same length, of any of these nucleotide sequences.

[0548] 61833 Nucleic Acid Fragments

[0549] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO:10 or 12. For example, such a nucleic acid molecule can include a fragment which can be used as a probe or primer or a fragment encoding a portion of a 61833 protein, e.g., an immunogenic or biologically active portion of a 61833 protein. A fragment can comprise those nucleotides of SEQ ID NO: 10, which encode a pyridoxyl-dependent decarboxylase domain of human 61833. The nucleotide sequence determined from the cloning of the 61833 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 61833 family members, or fragments thereof, as well as 61833 homologues, or fragments thereof, from other species.

[0550] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment which includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 100, 200, more preferably 300, most preferably 350 or more amino acids in length. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention.

[0551] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domain, region, or functional site described herein. Thus, for example, a 61833 nucleic acid fragment can include a sequence corresponding to a pyridoxyl-dependent decarboxylase domain.

[0552] 61833 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under a stringency condition described herein to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO:10 or SEQ ID NO:12, or of a naturally occurring allelic variant or mutant of SEQ ID NO:10 or SEQ ID NO:12.

[0553] In a preferred embodiment the nucleic acid is a probe which is at least 5 or 10, and less than 200, more preferably less than 100, or less than 50, base pairs in length. It should be identical, or differ by 1, or by fewer than 5 or 10 bases, from a sequence disclosed herein. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[0554] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid which encodes the pyridoxyl-dependent decarboxylase domain, e.g., about nucleotides 442 to 1524 of SEQ ID NO: 10, or any other domain, region, or sequence described herein of human 61833.
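The relationship between the amino acid coordinates of the domain (about residues 41 to 401 of SEQ ID NO:11) and its nucleotide coordinates (about nucleotides 442 to 1524 of SEQ ID NO:10) can be checked with simple arithmetic. The sketch below assumes that the coding region begins at nucleotide 322 of SEQ ID NO:10; that start position is inferred from the two coordinate pairs and is not stated explicitly in this passage. The function name is hypothetical.

```python
def residues_to_nucleotides(first_residue, last_residue, cds_start):
    """Map a 1-based, inclusive amino acid range to 1-based, inclusive
    nucleotide coordinates, given the position of the first coding base."""
    if first_residue < 1 or last_residue < first_residue:
        raise ValueError("invalid residue range")
    start = cds_start + 3 * (first_residue - 1)   # first base of first codon
    end = cds_start + 3 * last_residue - 1        # last base of last codon
    return start, end


# Assumed coding-region start (inferred, not recited in the text).
ASSUMED_CDS_START = 322

domain_nt = residues_to_nucleotides(41, 401, ASSUMED_CDS_START)
```

Under that assumption, `domain_nt` evaluates to `(442, 1524)`, matching the nucleotide range recited for the domain.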

[0555] In another embodiment a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 61833 sequence, e.g., a domain, region, site or other sequence described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical, or differ by one base, from a sequence disclosed herein or from a naturally occurring variant. For example, primers suitable for amplifying all or a portion of any of the following regions are provided: a pyridoxyl-dependent decarboxylase domain from about amino acid 41 to 401 of SEQ ID NO:11.

[0556] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[0557] A nucleic acid fragment encoding a “biologically active portion of a 61833 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:10 or 12, which encodes a polypeptide having a 61833 biological activity (e.g., the biological activities of the 61833 proteins are described herein), expressing the encoded portion of the 61833 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 61833 protein. For example, a nucleic acid fragment encoding a biologically active portion of 61833 includes a pyridoxyl-dependent decarboxylase domain, e.g., amino acid residues about 41 to 401 of SEQ ID NO: 11. A nucleic acid fragment encoding a biologically active portion of a 61833 polypeptide may comprise a nucleotide sequence which is 300 or more nucleotides in length.

[0558] In preferred embodiments, a nucleic acid includes a nucleotide sequence which is about 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1383 or more nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO: 10, or SEQ ID NO: 12. In a preferred embodiment, a nucleic acid includes at least one contiguous nucleotide from the region of about nucleotides 1 to 200, 100 to 250, 200 to 350, 322 to 441, 322 to 576, 442 to 576, 520 to 800, 520 to 1041, 800 to 1041, 1000 to 1200, 1000 to 1524, 1200 to 1524, 1525 to 1701, or 1700 to 1900 of SEQ ID NO:10.

[0559] 61833 Nucleic Acid Variants

[0560] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:10 or SEQ ID NO:12. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid which encodes the same 61833 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence which differs by at least 1, but less than 5, 10, 20, 50, or 100 amino acid residues from that shown in SEQ ID NO:11. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[0561] Nucleic acids of the invention can be chosen for having codons which are preferred, or non-preferred, for a particular expression system. E.g., the nucleic acid can be one in which at least one codon, and preferably at least 10%, or 20% of the codons, has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells.
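The proportion of altered codons referred to above can be measured by comparing an original coding sequence with its modified counterpart codon by codon. The following is a minimal sketch with a hypothetical function name; it assumes both sequences are in frame, of equal length, and a multiple of three nucleotides long, and it only counts changed codons without verifying that the encoded protein is unchanged.

```python
def codon_change_fraction(original, modified):
    """Fraction of codons that differ between two in-frame coding sequences."""
    original, modified = original.upper(), modified.upper()
    if len(original) != len(modified):
        raise ValueError("sequences must have equal length")
    if len(original) == 0 or len(original) % 3 != 0:
        raise ValueError("length must be a non-zero multiple of three")
    total = len(original) // 3
    changed = sum(
        1
        for i in range(0, len(original), 3)
        if original[i:i + 3] != modified[i:i + 3]
    )
    return changed / total
```

For instance, comparing `ATGAAACGT` with `ATGAAGCGC` changes two of three codons (AAA to AAG and CGT to CGC, both synonymous changes), so the function returns two thirds.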

[0562] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism) or can be non-naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared in the encoded product).

[0563] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO:10 or 12, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 1%, 5%, 10% or 20% of the nucleotides in the subject nucleic acid. If necessary for this analysis the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[0564] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is at least about 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the amino acid sequence shown in SEQ ID NO: 11 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under a stringency condition described herein, to the nucleotide sequence shown in SEQ ID NO: 10 or SEQ ID NO: 12 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 61833 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 61833 gene.

[0565] Preferred variants include those that are correlated with pyridoxyl-dependent decarboxylase activity.

[0566] Allelic variants of 61833, e.g., human 61833, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 61833 protein within a population that maintain the ability to bind pyridoxyl-5′-phosphate and a substrate molecule, e.g., ornithine or an amino acid, e.g., arginine. Functional allelic variants will typically contain only conservative substitution of one or more amino acids of SEQ ID NO:11, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally-occurring amino acid sequence variants of the 61833, e.g., human 61833, protein within a population that do not have the ability to catalyze the pyridoxyl-5′-phosphate-dependent removal of a carboxyl moiety from a substrate molecule, e.g., non-functional allelic variants may lack the ability to bind pyridoxyl-5′-phosphate or a substrate molecule, or they may lack decarboxylase activity. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO: 11, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[0567] Moreover, nucleic acid molecules encoding other 61833 family members and, thus, which have a nucleotide sequence which differs from the 61833 sequences of SEQ ID NO:10 or SEQ ID NO: 12 are intended to be within the scope of the invention.

[0568] Antisense Nucleic Acid Molecules, Ribozymes and Modified 61833 Nucleic Acid Molecules

[0569] In another aspect, the invention features an isolated nucleic acid molecule which is antisense to 61833. An “antisense” nucleic acid can include a nucleotide sequence which is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 61833 coding strand, or to only a portion thereof (e.g., the coding region of human 61833 corresponding to SEQ ID NO:12). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 61833 (e.g., the 5′ and 3′ untranslated regions).

[0570] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 61833 mRNA, but more preferably is an oligonucleotide which is antisense to only a portion of the coding or noncoding region of 61833 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 61833 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence of interest. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
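An antisense oligonucleotide complementary to the region surrounding the translation start site can be derived by taking the window of interest from the mRNA and computing its reverse complement. The sketch below is illustrative only. It assumes the window runs from position −10 to +10 with +1 being the first base of the start codon and no position 0 (20 nucleotides in total), and it returns a DNA oligonucleotide; both choices, and the function name, are assumptions.

```python
# Complement table mapping an RNA (or DNA) target onto a DNA oligonucleotide.
_TO_DNA_COMPLEMENT = str.maketrans("ACGUT", "TGCAA")


def antisense_to_start_site(mrna, start_index, upstream=10, downstream=10):
    """Return a DNA antisense oligo (5'->3') for the window around the start codon.

    mrna        -- target sequence written 5'->3' (U or T accepted)
    start_index -- 0-based index of the first base of the start codon
    """
    mrna = mrna.upper()
    if not 0 <= start_index < len(mrna):
        raise ValueError("start_index outside the sequence")
    lo = max(0, start_index - upstream)            # clip at the 5' end
    hi = min(len(mrna), start_index + downstream)  # clip at the 3' end
    target = mrna[lo:hi]
    # Complement each base, then reverse so the oligo reads 5'->3'.
    return target.translate(_TO_DNA_COMPLEMENT)[::-1]
```

The returned oligonucleotide contains `CAT`, the reverse complement of the AUG start codon, whenever the window fully covers that codon.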

[0571] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[0572] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a 61833 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies which bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

[0573] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[0574] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for a 61833-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of a 61833 cDNA disclosed herein (i.e., SEQ ID NO:10 or SEQ ID NO:12), and a sequence having a known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haseloff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a 61833-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 61833 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science 261:1411-1418.

[0575] 61833 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 61833 gene (e.g., the 61833 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 61833 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6:569-84; Helene, C. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) BioEssays 14:807-15. The potential sequences that can be targeted for triple helix formation can be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.
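As a non-limiting sketch of the purine/pyrimidine constraint noted above, the following Python fragment scans one strand of a duplex for uninterrupted purine (A/G) or pyrimidine (C/T) runs. The function name `find_tracts`, the default minimum length of 10, and the example sequence are assumptions made for illustration; they do not reflect any threshold or regulatory sequence taught herein.

```python
import re


def find_tracts(strand: str, min_len: int = 10):
    """List homopurine and homopyrimidine runs on one DNA strand.

    Returns tuples of (kind, start, end, sequence) with 1-based,
    inclusive coordinates, sorted by start position.
    """
    seq = strand.upper()
    tracts = []
    for kind, bases in (("purine", "AG"), ("pyrimidine", "CT")):
        pattern = re.compile("[%s]{%d,}" % (bases, min_len))
        for match in pattern.finditer(seq):
            tracts.append((kind, match.start() + 1, match.end(), match.group()))
    return sorted(tracts, key=lambda tract: tract[1])


if __name__ == "__main__":
    # Toy sequence, not a 61833 regulatory sequence.
    for tract in find_tracts("CCTAGGAGAAGGTTC", min_len=5):
        print(tract)
```

Where no single run is long enough, adjacent shorter runs of opposite type reported by such a scan would be the natural places to consider a switchback design.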

[0576] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[0577] A 61833 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For non-limiting examples of synthetic oligonucleotides with modifications see Toulmé (2001) Nature Biotech. 19:17 and Faria et al. (2001) Nature Biotech. 19:40-44. Such phosphoramidite oligonucleotides can be effective antisense agents.

[0578] For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4: 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra and Perry-O'Keefe et al. Proc. Natl. Acad. Sci. 93: 14670-675.

[0579] PNAs of 61833 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 61833 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene, (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes, (e.g., S1 nucleases (Hyrup B. et al. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[0580] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[0581] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region which is complementary to a 61833 nucleic acid of the invention, and two complementary regions, one having a fluorophore and one a quencher, such that the molecular beacon is useful for quantitating the presence of the 61833 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.

[0582] Isolated 61833 Polypeptides

[0583] In another aspect, the invention features an isolated 61833 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-61833 antibodies. 61833 protein can be isolated from cells or tissue sources using standard protein purification techniques. 61833 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[0584] Polypeptides of the invention include those which arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when expressed in a native cell.

[0585] In a preferred embodiment, a 61833 polypeptide has one or more of the following characteristics:

[0586] (i) it has the ability to decarboxylate a substrate molecule in a pyridoxyl-dependent fashion;

[0587] (ii) it has a molecular weight, e.g., a deduced molecular weight, preferably ignoring any contribution of post translational modifications, amino acid composition or other physical characteristic of a 61833 polypeptide, e.g., a polypeptide of SEQ ID NO:11;

[0588] (iii) it has an overall sequence similarity of at least 60%, preferably at least 70, 80, 90, or 95%, with a polypeptide of SEQ ID NO: 11;

[0589] (iv) it has a pyridoxyl-dependent decarboxylase domain which is preferably about 70%, 80%, 90% or 95% identical with amino acid residues about 41 to 401 of SEQ ID NO:11;

[0590] (v) it has a decarboxylase family 2 pyridoxyl phosphate attachment site, located at about amino acid residues 67 to 85 of SEQ ID NO:11; or

[0591] (vi) it has a decarboxylase family 2 signature motif, located at about amino acid residues 228 to 241 of SEQ ID NO:11.

[0592] In a preferred embodiment the 61833 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:11. In one embodiment it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another it differs from the corresponding sequence in SEQ ID NO: 11 by at least one residue but less than 20%, 15%, 10% or 5% of the residues in it differ from the corresponding sequence in SEQ ID NO: 11. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non essential residue or a conservative substitution. In a preferred embodiment the differences are not in the pyridoxyl-dependent decarboxylase domain, e.g., about amino acid residues 41 to 401 of SEQ ID NO:11. In another preferred embodiment one or more differences are in the pyridoxyl-dependent decarboxylase domain.
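The counting convention stated above, in which looped-out residues and mismatches both count as differences, can be illustrated with the following Python sketch. It assumes the two sequences have already been aligned and padded with "-" for gaps; the function names, the use of alignment length as the denominator, and the toy sequences are assumptions for illustration and are not drawn from SEQ ID NO:11.

```python
def count_differences(aligned_ref: str, aligned_query: str) -> int:
    """Count alignment columns that differ; a gap ('-') opposite a residue counts."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for ref, qry in zip(aligned_ref, aligned_query) if ref != qry)


def percent_identity(aligned_ref: str, aligned_query: str) -> float:
    """Percent of alignment columns that are identical (assumed denominator)."""
    columns = len(aligned_ref)
    if columns == 0:
        raise ValueError("empty alignment")
    differences = count_differences(aligned_ref, aligned_query)
    return 100.0 * (columns - differences) / columns


if __name__ == "__main__":
    ref, qry = "MKT-LLV", "MKTALIV"  # toy pre-aligned sequences
    print(count_differences(ref, qry), round(percent_identity(ref, qry), 1))
```

Other denominators (for example, the length of the shorter sequence) are also in common use, so any percentage computed this way should be reported together with the convention applied.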

[0593] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 61833 proteins differ in amino acid sequence from SEQ ID NO:11, yet retain biological activity.

[0594] In one embodiment, the protein includes an amino acid sequence at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more homologous to SEQ ID NO:11.

[0595] A 61833 protein or fragment is provided which varies from the sequence of SEQ ID NO: 11 in regions defined by amino acids about 1 to 460 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment but which does not differ from SEQ ID NO: 11 in regions defined by about amino acid residues 67 to 85 and 228 to 241 of SEQ ID NO: 11. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.

[0596] In one embodiment, a biologically active portion of a 61833 protein includes a pyridoxyl-dependent decarboxylase domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 61833 protein.

[0597] In a preferred embodiment, the 61833 protein has an amino acid sequence shown in SEQ ID NO: 11. In other embodiments, the 61833 protein is substantially identical to SEQ ID NO:11. In yet another embodiment, the 61833 protein is substantially identical to SEQ ID NO: 11 and retains the functional activity of the protein of SEQ ID NO:11, as described in detail in the subsections above.

[0598] 61833 Chimeric or Fusion Proteins

[0599] In another aspect, the invention provides 61833 chimeric or fusion proteins. As used herein, a 61833 “chimeric protein” or “fusion protein” includes a 61833 polypeptide linked to a non-61833 polypeptide. A “non-61833 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the 61833 protein, e.g., a protein which is different from the 61833 protein and which is derived from the same or a different organism. The 61833 polypeptide of the fusion protein can correspond to all or a portion, e.g., a fragment described herein, of a 61833 amino acid sequence. In a preferred embodiment, a 61833 fusion protein includes at least one (or two) biologically active portion of a 61833 protein. The non-61833 polypeptide can be fused to the N-terminus or C-terminus of the 61833 polypeptide.

[0600] The fusion protein can include a moiety which has a high affinity for a ligand. For example, the fusion protein can be a GST-61833 fusion protein in which the 61833 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 61833. Alternatively, the fusion protein can be a 61833 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 61833 can be increased through use of a heterologous signal sequence.

[0601] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[0602] The 61833 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 61833 fusion proteins can be used to affect the bioavailability of a 61833 substrate. 61833 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example, (i) aberrant modification or mutation of a gene encoding a 61833 protein; (ii) mis-regulation of the 61833 gene; and (iii) aberrant post-translational modification of a 61833 protein.

[0603] Moreover, the 61833-fusion proteins of the invention can be used as immunogens to produce anti-61833 antibodies in a subject, to purify 61833 ligands and in screening assays to identify molecules which inhibit the interaction of 61833 with a 61833 substrate.

[0604] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 61833-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 61833 protein.

[0605] Variants of 61833 Proteins

[0606] In another aspect, the invention also features a variant of a 61833 polypeptide, e.g., which functions as an agonist (mimetics) or as an antagonist. Variants of the 61833 proteins can be generated by mutagenesis, e.g., discrete point mutation, the insertion or deletion of sequences or the truncation of a 61833 protein. An agonist of the 61833 proteins can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of a 61833 protein. An antagonist of a 61833 protein can inhibit one or more of the activities of the naturally occurring form of the 61833 protein by, for example, competitively modulating a 61833-mediated activity of a 61833 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 61833 protein.

[0607] Variants of a 61833 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a 61833 protein for agonist or antagonist activity.

[0608] Libraries of fragments, e.g., N terminal, C terminal, or internal fragments, of a 61833 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of a 61833 protein. Variants in which a cysteine residue is added or deleted or in which a residue which is glycosylated is added or deleted are particularly preferred.
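For illustration, the three fragment classes named above can be enumerated in silico as follows. This Python sketch works on a one-letter amino acid string rather than on coding DNA, and the function name `fragment_library`, the `min_len` parameter, and the toy sequence are all hypothetical; an actual library would be generated experimentally.

```python
def fragment_library(protein: str, min_len: int = 8):
    """Enumerate N-terminal, C-terminal, and internal fragments of a sequence.

    The full-length sequence itself is excluded. Internal fragments touch
    neither terminus.
    """
    n = len(protein)
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    lengths = range(min_len, n)  # fragment lengths shorter than full length
    return {
        "n_terminal": [protein[:k] for k in lengths],
        "c_terminal": [protein[n - k:] for k in lengths],
        "internal": [
            protein[i:j]
            for i in range(1, n - 1)
            for j in range(i + min_len, n)
        ],
    }


if __name__ == "__main__":
    library = fragment_library("ABCDE", min_len=3)  # toy sequence
    for kind, fragments in library.items():
        print(kind, fragments)
```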

[0609] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property are known in the art. Such methods are adaptable for rapid screening of the gene libraries generated by combinatorial mutagenesis of 61833 proteins. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify 61833 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6:327-331).

[0610] Cell based assays can be exploited to analyze a variegated 61833 library. For example, a library of expression vectors can be transfected into a cell line, e.g., a cell line which ordinarily responds to 61833 in a substrate-dependent manner. The transfected cells are then contacted with 61833 and the effect of the expression of the mutant on signaling by the 61833 substrate can be detected, e.g., by measuring the levels of a 61833 product, e.g., a polyamine, e.g., putrescine or spermidine. Plasmid DNA can then be recovered from the cells which score for inhibition, or alternatively, potentiation of signaling by the 61833 substrate, and the individual clones further characterized.

[0611] In another aspect, the invention features a method of making a 61833 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 61833 polypeptide. The method includes: altering the sequence of a 61833 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[0612] In another aspect, the invention features a method of making a fragment or analog of a 61833 polypeptide having a biological activity of a naturally occurring 61833 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of a 61833 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[0613] Anti-61833 Antibodies

[0614] In another aspect, the invention provides an anti-61833 antibody, or a fragment thereof (e.g., an antigen-binding fragment thereof). The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. As used herein, the term “antibody” refers to a protein comprising at least one, and preferably two, heavy (H) chain variable regions (abbreviated herein as VH), and at least one and preferably two light (L) chain variable regions (abbreviated herein as VL). The VH and VL regions can be further subdivided into regions of hypervariability, termed “complementarity determining regions” (“CDR”), interspersed with regions that are more conserved, termed “framework regions” (FR). The extent of the framework region and CDR's has been precisely defined (see, Kabat, E. A., et al. (1991) Sequences of proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, and Chothia, C. et al. (1987) J. Mol. Biol. 196:901-917, which are incorporated herein by reference). Each VH and VL is composed of three CDR's and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.

[0615] The anti-61833 antibody can further include a heavy and light chain constant region, to thereby form a heavy and light immunoglobulin chain, respectively. In one embodiment, the antibody is a tetramer of two heavy immunoglobulin chains and two light immunoglobulin chains, wherein the heavy and light immunoglobulin chains are inter-connected by, e.g., disulfide bonds. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. The light chain constant region is comprised of one domain, CL. The variable region of the heavy and light chains contains a binding domain that interacts with an antigen. The constant regions of the antibodies typically mediate the binding of the antibody to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system.

[0616] As used herein, the term “immunoglobulin” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes. The recognized human immunoglobulin genes include the kappa, lambda, alpha (IgA1 and IgA2), gamma (IgG1, IgG2, IgG3, IgG4), delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Full-length immunoglobulin “light chains” (about 25 kDa or 214 amino acids) are encoded by a variable region gene at the NH2-terminus (about 110 amino acids) and a kappa or lambda constant region gene at the COOH-terminus. Full-length immunoglobulin “heavy chains” (about 50 kDa or 446 amino acids) are similarly encoded by a variable region gene (about 116 amino acids) and one of the other aforementioned constant region genes, e.g., gamma (encoding about 330 amino acids).

[0617] The term “antigen-binding fragment” of an antibody (or simply “antibody portion,” or “fragment”), as used herein, refers to one or more fragments of a full-length antibody that retain the ability to specifically bind to the antigen, e.g., 61833 polypeptide or fragment thereof. Examples of antigen-binding fragments of the anti-61833 antibody include, but are not limited to: (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab′)₂ fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 341:544-546), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) Science 242:423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879-5883). Such single chain antibodies are also encompassed within the term “antigen-binding fragment” of an antibody. These antibody fragments are obtained using conventional techniques known to those with skill in the art, and the fragments are screened for utility in the same manner as are intact antibodies.

[0618] The anti-61833 antibody can be a polyclonal or a monoclonal antibody. In other embodiments, the antibody can be recombinantly produced, e.g., produced by phage display or by combinatorial methods.

[0619] Phage display and combinatorial methods for generating anti-61833 antibodies are known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. International Publication No. WO 92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. International Publication No. WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and Barbas et al. (1991) PNAS 88:7978-7982, the contents of all of which are incorporated by reference herein).

[0620] In one embodiment, the anti-61833 antibody is a fully human antibody (e.g., an antibody made in a mouse which has been genetically engineered to produce an antibody from a human immunoglobulin sequence), or a non-human antibody, e.g., a rodent (mouse or rat), goat, primate (e.g., monkey), or camel antibody. Preferably, the non-human antibody is a rodent (mouse or rat) antibody. Methods of producing rodent antibodies are known in the art.

[0621] Human monoclonal antibodies can be generated using transgenic mice carrying the human immunoglobulin genes rather than the mouse system. Splenocytes from these transgenic mice immunized with the antigen of interest are used to produce hybridomas that secrete human mAbs with specific affinities for epitopes from a human protein (see, e.g., Wood et al. International Application WO 91/00906, Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International Application WO 92/03918; Kay et al. International Application WO 92/03917; Lonberg, N. et al. 1994 Nature 368:856-859; Green, L. L. et al. 1994 Nature Genet. 7:13-21; Morrison, S. L. et al. 1984 Proc. Natl. Acad. Sci. USA 81:6851-6855; Bruggeman et al. 1993 Year Immunol 7:33-40; Tuaillon et al. 1993 PNAS 90:3720-3724; Bruggeman et al. 1991 Eur J Immunol 21:1323-1326).

[0622] An anti-61833 antibody can be one in which the variable region, or a portion thereof, e.g., the CDR's, are generated in a non-human organism, e.g., a rat or mouse. Chimeric, CDR-grafted, and humanized antibodies are within the invention. Antibodies generated in a non-human organism, e.g., a rat or mouse, and then modified, e.g., in the variable framework or constant region, to decrease antigenicity in a human are within the invention.

[0623] Chimeric antibodies can be produced by recombinant DNA techniques known in the art. For example, a gene encoding the Fc constant region of a murine (or other species) monoclonal antibody molecule is digested with restriction enzymes to remove the region encoding the murine Fc, and the equivalent portion of a gene encoding a human Fc constant region is substituted (see Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., International Application WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al., European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al. (1987) PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al. (1985) Nature 314:446-449; and Shaw et al., 1988, J. Natl Cancer Inst. 80:1553-1559).

[0624] A humanized or CDR-grafted antibody will have at least one or two but generally all three recipient CDR's (of heavy and/or light immunoglobulin chains) replaced with a donor CDR. The antibody may be replaced with at least a portion of a non-human CDR or only some of the CDR's may be replaced with non-human CDR's. It is only necessary to replace the number of CDR's required for binding of the humanized antibody to a 61833 or a fragment thereof. Preferably, the donor will be a rodent antibody, e.g., a rat or mouse antibody, and the recipient will be a human framework or a human consensus framework. Typically, the immunoglobulin providing the CDR's is called the “donor” and the immunoglobulin providing the framework is called the “acceptor.” In one embodiment, the donor immunoglobulin is a non-human (e.g., rodent). The acceptor framework is a naturally-occurring (e.g., a human) framework or a consensus framework, or a sequence about 85% or higher, preferably 90%, 95%, 99% or higher identical thereto.

[0625] As used herein, the term “consensus sequence” refers to the sequence formed from the most frequently occurring amino acids (or nucleotides) in a family of related sequences (see, e.g., Winnacker, From Genes to Clones (Verlagsgesellschaft, Weinheim, Germany 1987)). In a family of proteins, each position in the consensus sequence is occupied by the amino acid occurring most frequently at that position in the family. If two amino acids occur equally frequently, either can be included in the consensus sequence. A “consensus framework” refers to the framework region in the consensus immunoglobulin sequence.
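The definition above amounts to a per-column majority vote, which can be sketched in Python as follows. The function name `consensus_sequence` and the toy sequences are hypothetical, the input is assumed to be pre-aligned sequences of equal length, and the tie-breaking rule (alphabetically first residue) is an arbitrary choice made only so that the output is reproducible, since the definition permits either residue.

```python
from collections import Counter


def consensus_sequence(aligned_sequences):
    """Return the most frequent residue at each position of an alignment."""
    if len({len(seq) for seq in aligned_sequences}) != 1:
        raise ValueError("need at least one sequence, all of equal length")
    consensus = []
    for column in zip(*aligned_sequences):
        counts = Counter(column)
        highest = max(counts.values())
        # On a tie either residue is acceptable; pick the alphabetically first.
        consensus.append(min(res for res, n in counts.items() if n == highest))
    return "".join(consensus)


if __name__ == "__main__":
    family = ["EVQLV", "EVKLV", "QVQLL"]  # toy framework-like sequences
    print(consensus_sequence(family))
```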

[0626] An antibody can be humanized by methods known in the art. Humanized antibodies can be generated by replacing sequences of the Fv variable region which are not directly involved in antigen binding with equivalent sequences from human Fv variable regions. General methods for generating humanized antibodies are provided by Morrison, S. L., 1985, Science 229:1202-1207, by Oi et al., 1986, BioTechniques 4:214, and by Queen et al. U.S. Pat. No. 5,585,089, U.S. Pat. No. 5,693,761 and U.S. Pat. No. 5,693,762, the contents of all of which are hereby incorporated by reference. Those methods include isolating, manipulating, and expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable regions from at least one of a heavy or light chain. Sources of such nucleic acid are well known to those skilled in the art and, for example, may be obtained from a hybridoma producing an antibody against a 61833 polypeptide or fragment thereof. The recombinant DNA encoding the humanized antibody, or fragment thereof, can then be cloned into an appropriate expression vector.

[0627] Humanized or CDR-grafted antibodies can be produced by CDR-grafting or CDR substitution, wherein one, two, or all CDR's of an immunoglobulin chain can be replaced. See e.g., U.S. Pat. No. 5,225,539; Jones et al. 1986 Nature 321:522-525; Verhoeyen et al. 1988 Science 239:1534; Beidler et al. 1988 J. Immunol. 141:4053-4060; Winter U.S. Pat. No. 5,225,539, the contents of all of which are hereby expressly incorporated by reference. Winter describes a CDR-grafting method which may be used to prepare the humanized antibodies of the present invention (UK Patent Application GB 2188638A, filed on Mar. 26, 1987; Winter U.S. Pat. No. 5,225,539), the contents of which are expressly incorporated by reference.

[0628] Also within the scope of the invention are humanized antibodies in which specific amino acids have been substituted, deleted or added. Preferred humanized antibodies have amino acid substitutions in the framework region, such as to improve binding to the antigen. For example, a humanized antibody will have framework residues identical to the donor framework residue or to another amino acid other than the recipient framework residue. To generate such antibodies, a selected, small number of acceptor framework residues of the humanized immunoglobulin chain can be replaced by the corresponding donor amino acids. Preferred locations of the substitutions include amino acid residues adjacent to the CDR, or which are capable of interacting with a CDR (see e.g., U.S. Pat. No. 5,585,089). Criteria for selecting amino acids from the donor are described in U.S. Pat. No. 5,585,089, e.g., columns 12-16 of U.S. Pat. No. 5,585,089, the contents of which are hereby incorporated by reference. Other techniques for humanizing antibodies are described in Padlan et al. EP 519596 A1, published on Dec. 23, 1992.

[0629] In preferred embodiments an antibody can be made by immunizing with purified 61833 antigen, or a fragment thereof, e.g., a fragment described herein, tissue, e.g., crude tissue preparations, whole cells, preferably living cells, lysed cells, or cell fractions, e.g., cytosolic fractions.

[0630] A full-length 61833 protein or antigenic peptide fragment of 61833 can be used as an immunogen or can be used to identify anti-61833 antibodies made with other immunogens, e.g., cells, membrane preparations, and the like. The antigenic peptide of 61833 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO:11 and encompasses an epitope of 61833. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.
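As an illustrative sketch of the length requirement above, the following Python fragment lists every contiguous peptide of a chosen length from a protein sequence and rejects lengths below the stated minimum of 8 residues. The function name `candidate_peptides`, the constant name, and the toy sequence are hypothetical; whether any given peptide actually encompasses an epitope must be determined experimentally.

```python
MIN_ANTIGENIC_LENGTH = 8  # minimum peptide length stated in the text


def candidate_peptides(protein: str, length: int = MIN_ANTIGENIC_LENGTH):
    """List (start, end, peptide) for each contiguous peptide of given length.

    Coordinates are 1-based and inclusive, matching residue numbering.
    """
    if length < MIN_ANTIGENIC_LENGTH:
        raise ValueError("peptide length must be at least 8 residues")
    return [
        (start + 1, start + length, protein[start:start + length])
        for start in range(len(protein) - length + 1)
    ]


if __name__ == "__main__":
    for entry in candidate_peptides("ABCDEFGHIJ", 8):  # toy sequence
        print(entry)
```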

[0631] Fragments of 61833 which include residues about 30 to 50, about 276 to 293, or about 323 to 341 can be used to make antibodies against hydrophobic regions of the 61833 protein, e.g., used as immunogens or used to characterize the specificity of an antibody. Similarly, fragments of 61833 which include residues about 300 to 310 can be used to make an antibody against a hydrophilic region of the 61833 protein; a fragment of 61833 which includes residues about 128 to 150, about 170 to 205, or about 370 to 401 can be used to make an antibody against the pyridoxyl-dependent decarboxylase region of the 61833 protein.

[0632] Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[0633] Antibodies which bind only native 61833 protein, only denatured or otherwise non-native 61833 protein, or which bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be identified by identifying antibodies which bind to native but not denatured 61833 protein.

[0634] Preferred epitopes encompassed by the antigenic peptide are regions of 61833 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 61833 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 61833 protein and are thus likely to constitute surface residues useful for targeting antibody production.
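The following Python sketch illustrates one simple way to flag hydrophilic stretches of a sequence. It is not an Emini surface probability analysis; it substitutes a sliding-window average of the Kyte-Doolittle hydropathy scale, in which more negative values indicate more hydrophilic segments. The function names, the default window of 7 residues, and the toy sequence are assumptions for illustration only.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(protein: str, window: int = 7):
    """Return (start, end, mean hydropathy) for each window, 1-based inclusive."""
    seq = protein.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in seq]  # KeyError if non-standard
    return [
        (start + 1, start + window, sum(scores[start:start + window]) / window)
        for start in range(len(seq) - window + 1)
    ]


def most_hydrophilic_window(protein: str, window: int = 7):
    """Return the window with the lowest (most hydrophilic) mean score."""
    return min(window_hydropathy(protein, window), key=lambda item: item[2])


if __name__ == "__main__":
    print(most_hydrophilic_window("IIIIIKKKKKDDDDD", window=5))  # toy sequence
```

A hydropathy average is only a rough proxy for surface exposure, so windows flagged in this way would ordinarily be cross-checked against a dedicated surface probability or antigenicity analysis.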

[0635] In preferred embodiments, antibodies can bind one or more of: purified antigen; tissue, e.g., tissue sections; whole cells, preferably living cells; lysed cells; or cell fractions, e.g., cytosolic fractions.

[0636] The anti-61833 antibody can be a single chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher, D. et al. (1999) Ann N Y Acad Sci 880:263-80; and Reiter, Y. (1996) Clin Cancer Res 2:245-52). The single chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 61833 protein.

[0637] In a preferred embodiment, the antibody has effector function and can fix complement. In other embodiments, the antibody does not recruit effector cells or fix complement.

[0638] In a preferred embodiment, the antibody has reduced or no ability to bind an Fc receptor. For example, it is an isotype or subtype, fragment or other mutant, which does not support binding to an Fc receptor, e.g., it has a mutagenized or deleted Fc receptor binding region.

[0639] In a preferred embodiment, an anti-61833 antibody alters (e.g., increases or decreases) the pyridoxyl-dependent decarboxylase activity of a 61833 polypeptide. For example, the antibody can bind at or in proximity to a pyridoxyl-5′-phosphate-binding site or a substrate-binding site, e.g., to an epitope that includes a residue located from about 67 to 85 or 228 to 241 of SEQ ID NO:1.

[0640] The antibody can be coupled to a toxin, e.g., a polypeptide toxin, e.g., ricin or diphtheria toxin or an active fragment thereof, or to a radioactive nucleus, or to an imaging agent, e.g., a radioactive, enzymatic, or other imaging agent, e.g., an NMR contrast agent. Labels which produce detectable radioactive emissions or fluorescence are preferred.

[0641] An anti-61833 antibody (e.g., monoclonal antibody) can be used to isolate 61833 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-61833 antibody can be used to detect 61833 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-61833 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labelling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

[0642] The invention also includes a nucleic acid which encodes an anti-61833 antibody, e.g., an anti-61833 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g., CHO or lymphatic cells.

[0643] The invention also includes cell lines, e.g., hybridomas, which make an anti-61833 antibody, e.g., an antibody described herein, and methods of using said cells to make a 61833 antibody.

[0644] 61833 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[0645] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[0646] A vector can include a 61833 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 61833 proteins, mutant forms of 61833 proteins, fusion proteins, and the like).

[0647] The recombinant expression vectors of the invention can be designed for expression of 61833 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[0648] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[0649] Purified fusion proteins can be used in 61833 activity assays (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 61833 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells which are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six weeks).

[0650] One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
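
The codon-alteration strategy can be sketched, for illustration, as a synonymous recoding of a coding sequence. The sketch assumes the standard genetic code and a single preferred codon per amino acid; the preferred-codon table below is an illustrative assumption and is not taken from Wada et al., whose codon usage tabulation (or a comparable one) should be consulted in practice. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# ASSUMPTION: one frequently used E. coli codon per residue, for illustration.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TAA",
}


def translate(cds):
    """Translate a coding DNA sequence codon by codon ('*' marks a stop)."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode_for_host(cds, preferred=PREFERRED_CODON):
    """Replace every codon with the preferred synonymous codon."""
    return "".join(preferred[aa] for aa in translate(cds))


if __name__ == "__main__":
    original = "ATGAAGTTATAG"          # hypothetical fragment: Met-Lys-Leu-stop
    recoded = recode_for_host(original)
    # The encoded protein must be unchanged by the recoding.
    assert translate(recoded) == translate(original)
    print(original, "->", recoded)
```

Because only synonymous substitutions are made, the encoded amino acid sequence is unchanged; the recoded sequence would then be produced by DNA synthesis as described above.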

[0651] The 61833 expression vector can be a yeast expression vector, a vector for expression in insect cells, e.g., a baculovirus expression vector or a vector suitable for expression in mammalian cells.

[0652] When used in mammalian cells, the expression vector's control functions can be provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[0653] In another embodiment, the promoter is an inducible promoter, e.g., a promoter regulated by a steroid hormone, by a polypeptide hormone (e.g., by means of a signal transduction pathway), or by a heterologous polypeptide (e.g., the tetracycline-inducible systems, “Tet-On” and “Tet-Off”; see, e.g., Clontech Inc., CA; Gossen and Bujard (1992) Proc. Natl. Acad. Sci. USA 89:5547; and Paillard (1998) Human Gene Therapy 9:983).

[0654] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3:537-546).

[0655] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus.

[0656] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., a 61833 nucleic acid molecule within a recombinant expression vector or a 61833 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[0657] A host cell can be any prokaryotic or eukaryotic cell. For example, a 61833 protein can be expressed in bacterial cells (such as E. coli), insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells). Other suitable host cells are known to those skilled in the art.

[0658] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[0659] A host cell of the invention can be used to produce (i.e., express) a 61833 protein. Accordingly, the invention further provides methods for producing a 61833 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a 61833 protein has been introduced) in a suitable medium such that a 61833 protein is produced. In another embodiment, the method further includes isolating a 61833 protein from the medium or the host cell.

[0660] In another aspect, the invention features a cell or purified preparation of cells which include a 61833 transgene, or which otherwise misexpress 61833. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include a 61833 transgene, e.g., a heterologous form of a 61833, e.g., a gene derived from humans (in the case of a non-human cell). The 61833 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene that misexpresses an endogenous 61833, e.g., a gene the expression of which is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders that are related to mutated or misexpressed 61833 alleles or for use in drug screening.

[0661] In another aspect, the invention features a human cell, e.g., a hematopoietic stem cell, transformed with a nucleic acid which encodes a subject 61833 polypeptide.

[0662] Also provided are cells, preferably human cells, e.g., human hematopoietic or fibroblast cells, in which an endogenous 61833 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 61833 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 61833 gene. For example, an endogenous 61833 gene which is “transcriptionally silent,” e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques such as targeted homologous recombination can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published on May 16, 1991.

[0663] In a preferred embodiment, recombinant cells described herein can be used for replacement therapy in a subject. For example, a nucleic acid encoding a 61833 polypeptide operably linked to an inducible promoter (e.g., a steroid hormone receptor-regulated promoter) is introduced into a human or nonhuman, e.g., mammalian, e.g., porcine recombinant cell. The cell is cultivated and encapsulated in a biocompatible material, such as poly-lysine alginate, and subsequently implanted into the subject. See, e.g., Lanza (1996) Nat. Biotechnol. 14:1107; Joki et al. (2001) Nat. Biotechnol. 19:35; and U.S. Pat. No. 5,876,742. Production of a 61833 polypeptide can be regulated in the subject by administering an agent (e.g., a steroid hormone) to the subject. In another preferred embodiment, the implanted recombinant cells express and secrete an antibody specific for a 61833 polypeptide. The antibody can be any antibody or any antibody derivative described herein.

[0664] 61833 Transgenic Animals

[0665] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of a 61833 protein and for identifying and/or evaluating modulators of 61833 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 61833 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[0666] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of a 61833 protein to particular cells. A transgenic founder animal can be identified based upon the presence of a 61833 transgene in its genome and/or expression of 61833 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a 61833 protein can further be bred to other transgenic animals carrying other transgenes.

[0667] 61833 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments, the nucleic acid is placed under the control of a tissue specific promoter, e.g., a milk or egg specific promoter, and the protein or polypeptide is recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[0668] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[0669] Uses of 61833

[0670] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: a) screening assays; b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); and c) methods of treatment (e.g., therapeutic and prophylactic).

[0671] The isolated nucleic acid molecules of the invention can be used, for example, to express a 61833 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect a 61833 mRNA (e.g., in a biological sample) or a genetic alteration in a 61833 gene, and to modulate 61833 activity, as described further below. The 61833 proteins can be used to treat disorders characterized by insufficient or excessive production of a 61833 substrate or production of 61833 inhibitors. In addition, the 61833 proteins can be used to screen for naturally occurring 61833 substrates, to screen for drugs or compounds which modulate 61833 activity, as well as to treat disorders characterized by insufficient or excessive production of 61833 protein or production of 61833 protein forms which have decreased, aberrant or unwanted activity compared to 61833 wild type protein (e.g., cellular proliferative or differentiative disorder, a cardiovascular disorder, or a disorder of the brain). Moreover, the anti-61833 antibodies of the invention can be used to detect and isolate 61833 proteins, regulate the bioavailability of 61833 proteins, and modulate 61833 activity.

[0672] The 61833 polypeptide of the invention can also be used in a method of modifying a compound, e.g., a method of decarboxylating a substrate compound. The method includes: providing a compound of interest, e.g., ornithine or an amino acid, e.g., arginine; providing a reaction-compatible solvent; combining the compound of interest, a 61833 polypeptide described herein, and pyridoxyl-5′-phosphate in the solvent; and maintaining the solvent mixture under conditions such that a carboxyl moiety is removed from the compound of interest. The method can further include isolating the compound of interest from the solvent. Such a method can be useful for synthesizing compounds, e.g., putrescine, in vitro for, e.g., laboratory or pharmaceutical use.
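
As a worked illustration of the decarboxylation described above, the removal of a carboxyl moiety as CO2 from ornithine (C5H12N2O2) yields putrescine (C4H12N2), and from arginine (C6H14N4O2) yields agmatine (C5H14N4). The following sketch merely checks this atom and mass bookkeeping; it assumes standard average atomic masses, and the function names are hypothetical. It says nothing about reaction conditions or yield.

```python
from collections import Counter

# Standard average atomic masses (g/mol), rounded to three decimals.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

ORNITHINE = Counter({"C": 5, "H": 12, "N": 2, "O": 2})
ARGININE = Counter({"C": 6, "H": 14, "N": 4, "O": 2})
CO2 = Counter({"C": 1, "O": 2})


def decarboxylate(formula):
    """Return the elemental formula remaining after loss of one CO2."""
    remaining = Counter(formula)
    remaining.subtract(CO2)
    if any(count < 0 for count in remaining.values()):
        raise ValueError("formula cannot lose CO2")
    return +remaining  # unary plus drops elements whose count fell to zero


def molar_mass(formula):
    """Sum the atomic masses of all atoms in the formula."""
    return sum(ATOMIC_MASS[element] * n for element, n in formula.items())


if __name__ == "__main__":
    putrescine = decarboxylate(ORNITHINE)
    print(dict(putrescine), round(molar_mass(putrescine), 3))
```

The mass of the product plus the mass of CO2 equals the mass of the substrate, which is a convenient sanity check when planning an in vitro synthesis on a molar basis.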

[0673] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 61833 polypeptide is provided. The method includes: contacting the compound with the subject 61833 polypeptide; and evaluating ability of the compound to interact with, e.g., to bind or form a complex with the subject 61833 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with subject 61833 polypeptide. It can also be used to find natural or synthetic inhibitors of subject 61833 polypeptide. Screening methods are discussed in more detail below.

[0674] 61833 Screening Assays

[0675] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to 61833 proteins, have a stimulatory or inhibitory effect on, for example, 61833 expression or 61833 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 61833 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 61833 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[0676] In one embodiment, the invention provides assays for screening candidate or test compounds which are substrates of a 61833 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate an activity of a 61833 protein or polypeptide or a biologically active portion thereof.

[0677] In one embodiment, a 61833 protein can be purified from an animal tissue sample and then its activity can be assayed in vitro, as described in, e.g., Seely and Pegg (1983), Methods Enzymol. 94: 158-161, the contents of which are incorporated herein by reference.

[0678] Bacterially expressed, recombinant 61833 protein can be purified and assayed in vitro for activity as described, e.g., in Poulin et al. (1992), J. Biol. Chem. 267: 150-158, the contents of which are incorporated herein by reference.

[0679] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann, R. N. et al. (1994) J. Med. Chem. 37:2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).

[0680] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. U.S.A. 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and Gallop et al. (1994) J. Med. Chem. 37:1233.

[0681] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner, U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. 5,223,409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382; Felici (1991) J. Mol. Biol. 222:301-310; Ladner supra.).

[0682] In one embodiment, an assay is a cell-based assay in which a cell which expresses a 61833 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 61833 activity is determined. Determining the ability of the test compound to modulate 61833 activity can be accomplished by monitoring, for example, the generation of a 61833 product, e.g., putrescine. The cell, for example, can be of mammalian origin, e.g., human.

[0683] The ability of the test compound to modulate 61833 binding to a compound, e.g., a 61833 substrate, or to bind to 61833 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 61833 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 61833 could be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 61833 binding to a 61833 substrate in a complex. For example, compounds (e.g., 61833 substrates) can be labeled with ¹²⁵I, ³⁵S, ¹⁴C, or ³H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[0684] The ability of a compound (e.g., a 61833 substrate) to interact with 61833 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 61833 without the labeling of either the compound or the 61833 (McConnell, H. M. et al. (1992) Science 257:1906-1912). As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 61833.

[0685] In yet another embodiment, a cell-free assay is provided in which a 61833 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 61833 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 61833 proteins to be used in assays of the present invention include fragments which participate in interactions with non-61833 molecules, e.g., fragments with high surface probability scores.

[0686] Soluble and/or membrane-bound forms of isolated proteins (e.g., 61833 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, Isotridecylpoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[0687] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[0688] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternatively, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
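
The distance dependence of energy transfer mentioned above is commonly described by the Förster relationship, E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor separation and R0 is the separation at which transfer is 50% efficient. This relationship is not recited in the text and is given here only as a hedged illustration; the R0 value of 50 Å used below is a hypothetical value, since R0 depends on the particular donor-acceptor pair.

```python
def fret_efficiency(r, r0):
    """Forster transfer efficiency for separation r and Forster radius r0.

    Both distances must be in the same unit (e.g., angstroms).
    """
    if r < 0 or r0 <= 0:
        raise ValueError("r must be >= 0 and r0 must be > 0")
    return 1.0 / (1.0 + (r / r0) ** 6)


if __name__ == "__main__":
    R0 = 50.0  # ASSUMPTION: hypothetical Forster radius in angstroms
    for distance in (25.0, 50.0, 100.0):
        print(f"r = {distance:5.1f} A  E = {fret_efficiency(distance, R0):.3f}")
```

The steep sixth-power dependence is why acceptor emission is expected to be near maximal only when the labeled molecules are bound to one another and falls off sharply once they separate beyond roughly R0.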

[0689] In another embodiment, determining the ability of the 61833 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal which can be used as an indication of real-time reactions between biological molecules.
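
For illustration, real-time binding signals of this kind are often interpreted with a simple 1:1 interaction model, in which the response during analyte injection rises toward an equilibrium value set by the dissociation constant KD = kd/ka. The model, the function names, and all parameter values below are assumptions introduced for this sketch; they are not taken from the text or from any instrument's software.

```python
import math


def equilibrium_response(conc, ka, kd, rmax):
    """Steady-state response for a 1:1 interaction at analyte concentration conc."""
    k_d_const = kd / ka  # equilibrium dissociation constant KD (M)
    return rmax * conc / (conc + k_d_const)


def association_response(t, conc, ka, kd, rmax):
    """Response at time t (s) after the start of analyte injection."""
    k_obs = ka * conc + kd  # observed rate constant (1/s)
    return equilibrium_response(conc, ka, kd, rmax) * (1.0 - math.exp(-k_obs * t))


if __name__ == "__main__":
    # Hypothetical parameters: ka in 1/(M*s), kd in 1/s, rmax in response units.
    ka, kd, rmax = 1e5, 1e-3, 100.0
    conc = 1e-8  # analyte concentration equal to KD in this example
    for t in (0, 300, 3000):
        print(t, round(association_response(t, conc, ka, kd, rmax), 2))
```

In such a model, an analyte concentration equal to KD gives half of the maximal response at equilibrium, which is one way a binding affinity can be read from a titration series.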

[0690] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound (which is not anchored) can be labeled, either directly or indirectly, with detectable labels discussed herein.

[0691] It may be desirable to immobilize either 61833, an anti-61833 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a 61833 protein, or interaction of a 61833 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/61833 fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 61833 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and complex formation is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 61833 binding or activity determined using standard techniques.

[0692] Other techniques for immobilizing either a 61833 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 61833 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96-well plates (Pierce Chemical).

[0693] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[0694] In one embodiment, this assay is performed utilizing antibodies reactive with 61833 protein or target molecules but which do not interfere with binding of the 61833 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 61833 protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 61833 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 61833 protein or target molecule.

[0695] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components, by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., (1993) Trends Biochem Sci 18:284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York.); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. (1999) Current Protocols in Molecular Biology, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., (1998) J Mol Recognit 11:141-8; Hage, D. S., and Tweed, S. A. (1997) J Chromatogr B Biomed Sci Appl. 699:499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[0696] In a preferred embodiment, the assay includes contacting the 61833 protein or biologically active portion thereof with a known compound which binds 61833 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a 61833 protein, wherein determining the ability of the test compound to interact with a 61833 protein includes determining the ability of the test compound to preferentially bind to 61833 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[0697] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 61833 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of a 61833 protein through modulation of the activity of a downstream effector of a 61833 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[0698] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), a reaction mixture containing the target gene product and the binding partner is prepared, under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.
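The comparison between control and test reaction mixtures described above can be expressed numerically. The following is a minimal sketch, assuming that complex formation is read out as a single signal value per reaction mixture; the function names, the background correction term, and the 50% cutoff are illustrative assumptions and are not taken from this disclosure.

```python
def percent_inhibition(control_signal, test_signal, background=0.0):
    """Return the percent reduction in complex formation caused by a test compound.

    control_signal: complex signal without the test compound (or with placebo)
    test_signal:    complex signal in the presence of the test compound
    background:     signal of a blank reaction (hypothetical correction term)
    """
    control = control_signal - background
    test = test_signal - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (control - test) / control


def interferes(control_signal, test_signal, background=0.0, cutoff=50.0):
    """Flag a compound as interfering when inhibition meets an assumed cutoff."""
    return percent_inhibition(control_signal, test_signal, background) >= cutoff
```

The same two functions can be applied separately to reaction mixtures containing normal and mutant target gene product, so that a compound that interferes only with the mutant interaction can be recognized.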

[0699] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[0700] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner, is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled, either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[0701] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[0702] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[0703] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in that either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[0704] In yet another aspect, the 61833 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 61833 (“61833-binding proteins” or “61833-bp”) and are involved in 61833 activity. Such 61833-bps can be activators or inhibitors of signals by the 61833 proteins or 61833 targets as, for example, downstream elements of a 61833-mediated signaling pathway.

[0705] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a 61833 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively, the 61833 protein can be fused to the activator domain.) If the “bait” and the “prey” proteins are able to interact, in vivo, forming a 61833-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., lacZ) which is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene which encodes the protein which interacts with the 61833 protein.

[0706] In another embodiment, modulators of 61833 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 61833 mRNA or protein evaluated relative to the level of expression of 61833 mRNA or protein in the absence of the candidate compound. When expression of 61833 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 61833 mRNA or protein expression. Alternatively, when expression of 61833 mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 61833 mRNA or protein expression. The level of 61833 mRNA or protein expression can be determined by methods described herein for detecting 61833 mRNA or protein.
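The decision rule in the preceding paragraph can be sketched in code. This is an illustration only: the disclosure does not name a statistical test, so the exact permutation test on group means, the 0.05 significance level, and the function names below are all assumptions made for the example.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Exact two-sided permutation test on the difference of group means."""
    pooled = list(treated) + list(untreated)
    n = len(treated)
    m = len(pooled) - n
    observed = abs(mean(treated) - mean(untreated))
    total = sum(pooled)
    extreme = 0
    n_splits = 0
    # Enumerate every way of relabeling n of the pooled values as "treated".
    for idx in combinations(range(len(pooled)), n):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n - (total - group_sum) / m)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        n_splits += 1
    return extreme / n_splits


def classify_compound(treated, untreated, alpha=0.05):
    """Label a candidate compound from replicate expression measurements.

    treated / untreated: expression levels with and without the compound.
    alpha: assumed significance level (not specified in the source text).
    """
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"
```

With this particular test, at least four replicates per group are needed before a p-value below 0.05 is attainable, because with three replicates per group the smallest possible two-sided p-value is 2/20.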

[0707] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a 61833 protein can be confirmed in vivo, e.g., in an animal such as an animal model for excessive or abnormal cellular proliferation or differentiation, or an animal having a cardiovascular or a neurological disorder.

[0708] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a 61833 modulating agent, an antisense 61833 nucleic acid molecule, a 61833-specific antibody, or a 61833-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[0709] 61833 Detection Assays

[0710] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome, e.g., to locate gene regions associated with genetic disease or to associate 61833 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[0711] 61833 Chromosome Mapping

[0712] The 61833 nucleotide sequences or portions thereof can be used to map the location of the 61833 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 61833 sequences with genes associated with disease.

[0713] Briefly, 61833 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 61833 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 61833 sequences will yield an amplified fragment.

[0714] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes, and a full set of mouse chromosomes, can allow easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220:919-924).
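The panel-based assignment described above amounts to finding the human chromosome whose presence across the hybrids matches the pattern of PCR-positive hybrids. The following minimal sketch illustrates that concordance check; the data layout, function names, and hybrid labels are hypothetical and serve only as an example.

```python
def concordance_by_chromosome(pcr_results, panel):
    """Score each chromosome by agreement between PCR outcome and chromosome content.

    pcr_results: dict mapping hybrid name -> True if the primers gave a product
    panel:       dict mapping hybrid name -> set of human chromosomes retained
    Returns a dict mapping chromosome -> fraction of hybrids in agreement.
    """
    if not pcr_results:
        raise ValueError("no PCR results supplied")
    chromosomes = set().union(*panel.values())
    scores = {}
    for chrom in chromosomes:
        agree = sum(
            1
            for hybrid, amplified in pcr_results.items()
            if amplified == (chrom in panel[hybrid])
        )
        scores[chrom] = agree / len(pcr_results)
    return scores


def assign_chromosome(pcr_results, panel):
    """Return the chromosomes that are fully concordant with the PCR pattern."""
    scores = concordance_by_chromosome(pcr_results, panel)
    return sorted(chrom for chrom, score in scores.items() if score == 1.0)
```

A result with exactly one fully concordant chromosome supports assignment to that chromosome; an empty or multi-element result suggests that the panel is not informative enough or that some PCR calls need to be repeated.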

[0715] Other mapping strategies, e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries, can be used to map 61833 to a chromosomal location.

[0716] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques ((1988) Pergamon Press, New York).

[0717] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[0718] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325:783-787.

[0719] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 61833 gene, can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

[0720] 61833 Tissue Typing

[0721] 61833 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).
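The RFLP principle can be illustrated with an in silico digest. The sketch below assumes a linear sequence, reports single-strand fragment lengths only, and uses the EcoRI recognition site (G^AATTC) purely as an example enzyme; the function names and the toy alleles used to exercise them are hypothetical.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence at every site.

    site:       recognition sequence (default EcoRI, for illustration only)
    cut_offset: position of the cut within the site (EcoRI cuts after the G)
    """
    sequence = sequence.upper()
    cuts = []
    start = 0
    while True:
        pos = sequence.find(site, start)
        if pos == -1:
            break
        cuts.append(pos + cut_offset)
        start = pos + 1
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


def rflp_differs(allele_a, allele_b, site="GAATTC", cut_offset=1):
    """True when two alleles give different fragment length patterns."""
    return sorted(digest(allele_a, site, cut_offset)) != sorted(
        digest(allele_b, site, cut_offset)
    )
```

A single base change that destroys or creates a recognition site changes the set of fragment lengths, which is what appears as a different band pattern after separation and probing.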

[0722] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 61833 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.
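Preparing a primer pair from the two ends of a known sequence can be sketched as follows. This is a simplified illustration, assuming an unambiguous A/C/G/T sequence and a fixed primer length; the default length of 20 and the function names are assumptions, and practical primer design would also weigh melting temperature and specificity.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def end_primers(sequence, length=20):
    """Return (forward, reverse) primers taken from the 5' and 3' ends.

    The forward primer is the first `length` bases of the given strand.
    The reverse primer is the reverse complement of the last `length` bases,
    so that it anneals to the given strand and extends toward the 5' end.
    """
    sequence = sequence.upper()
    if len(sequence) < 2 * length:
        raise ValueError("sequence too short for two non-overlapping primers")
    forward = sequence[:length]
    reverse = reverse_complement(sequence[-length:])
    return forward, reverse
```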

[0723] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences of SEQ ID NO:10 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO: 12 are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[0724] If a panel of reagents from 61833 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[0725] Use of Partial 61833 Sequences in Forensic Biology

[0726] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[0727] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e. another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO:10 (e.g., fragments derived from the noncoding regions of SEQ ID NO: 10 having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[0728] The 61833 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 61833 probes can be used to identify tissue by species and/or by organ type.

[0729] In a similar fashion, these reagents, e.g., 61833 primers or probes can be used to screen tissue culture for contamination (i.e. screen for the presence of a mixture of different types of cells in a culture).

[0730] Predictive Medicine of 61833

[0731] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[0732] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene which encodes 61833.

[0733] Such disorders include, e.g., a disorder associated with the misexpression of 61833 gene.

[0734] The method includes one or more of the following:

[0735] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 61833 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[0736] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 61833 gene;

[0737] detecting, in a tissue of the subject, the misexpression of the 61833 gene, at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[0738] detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of a 61833 polypeptide.

[0739] In preferred embodiments the method includes: ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 61833 gene; an insertion of one or more nucleotides into the gene, a point mutation, e.g., a substitution of one or more nucleotides of the gene, a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[0740] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from SEQ ID NO: 10, or naturally occurring mutants thereof or 5′ or 3′ flanking sequences naturally associated with the 61833 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[0741] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 61833 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 61833.

[0742] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[0743] In preferred embodiments the method includes determining the structure of a 61833 gene, an abnormal structure being indicative of risk for the disorder.

[0744] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 61833 protein or a nucleic acid, which hybridizes specifically with the gene. These and other embodiments are discussed below.

[0745] Diagnostic and Prognostic Assays of 61833

[0746] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 61833 molecules and for identifying variations and mutations in the sequence of 61833 molecules.

[0747] Expression Monitoring and Profiling:

[0748] The presence, level, or absence of 61833 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 61833 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 61833 protein such that the presence of 61833 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the 61833 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 61833 genes; measuring the amount of protein encoded by the 61833 genes; or measuring the activity of the protein encoded by the 61833 genes.

[0749] The level of mRNA corresponding to the 61833 gene in a cell can be determined both by in situ and by in vitro formats.

[0750] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 61833 nucleic acid, such as the nucleic acid of SEQ ID NO: 10, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 61833 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[0751] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 61833 genes.

[0752] The level of mRNA in a sample that is encoded by 61833 can be evaluated with nucleic acid amplification, e.g., by rtPCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
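The general primer dimensions stated above can be checked against a template sequence. The sketch below is illustrative only: it treats "about 10 to 30" and "about 50 to 200" as hard limits, requires exact matches between primers and template, and uses hypothetical function names.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def flanked_region_length(template, forward, reverse):
    """Length of the template region lying between the two primer sites.

    forward matches the template strand as written; reverse is written 5' to 3'
    and anneals to the template strand, so its site is its reverse complement.
    Returns None when either primer site is not found in the expected order.
    """
    template = template.upper()
    fwd_start = template.find(forward.upper())
    if fwd_start == -1:
        return None
    fwd_end = fwd_start + len(forward)
    rev_start = template.find(reverse_complement(reverse), fwd_end)
    if rev_start == -1:
        return None
    return rev_start - fwd_end


def primer_pair_ok(template, forward, reverse,
                   primer_range=(10, 30), region_range=(50, 200)):
    """Check primer lengths and flanked-region length against assumed limits."""
    for primer in (forward, reverse):
        if not primer_range[0] <= len(primer) <= primer_range[1]:
            return False
    gap = flanked_region_length(template, forward, reverse)
    return gap is not None and region_range[0] <= gap <= region_range[1]
```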

[0753] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 61833 gene being analyzed.

[0754] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 61833 mRNA, or genomic DNA, and comparing the presence of 61833 mRNA or genomic DNA in the control sample with the presence of 61833 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 61833 transcript levels.

[0755] A variety of methods can be used to determine the level of protein encoded by 61833. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody with a sample, to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)₂) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[0756] The detection methods can be used to detect 61833 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 61833 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 61833 protein include introducing into a subject a labeled anti-61833 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-61833 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[0757] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 61833 protein, and comparing the presence of 61833 protein in the control sample with the presence of 61833 protein in the test sample.

[0758] The invention also includes kits for detecting the presence of 61833 in a biological sample. For example, the kit can include a compound or agent capable of detecting 61833 protein or mRNA in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 61833 protein or nucleic acid.

[0759] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[0760] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples which can be assayed and compared to the test sample. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[0761] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with misexpressed or aberrant or unwanted 61833 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as atherosclerosis or deregulated cell proliferation.

[0762] In one embodiment, a disease or disorder associated with aberrant or unwanted 61833 expression or activity is identified. A test sample is obtained from a subject and 61833 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 61833 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 61833 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[0763] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 61833 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for a cellular proliferative and/or differentiative disorder, a cardiovascular disorder, or a disorder of the brain.

[0764] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 61833 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 61833 (e.g., other genes associated with a 61833-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
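
As an illustration only, the data records described above could be laid out as rows of a relational table. The sketch below uses Python's built-in sqlite3 module rather than the Oracle or Sybase environments named in the text; the table name, column names, gene label GENE_X, and all values are hypothetical.

```python
import sqlite3

# Hypothetical schema: one row per (sample, gene) expression value.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE expression_record (
           sample_id  TEXT NOT NULL,  -- identifier of the sample
           subject_id TEXT,           -- subject from which the sample was derived
           diagnosis  TEXT,           -- optional descriptor
           treatment  TEXT,           -- optional descriptor
           gene       TEXT NOT NULL,  -- '61833' or another gene on the array
           level      REAL NOT NULL,  -- value representing the level of expression
           PRIMARY KEY (sample_id, gene)
       )"""
)

# Made-up example records; GENE_X stands in for any other gene.
rows = [
    ("S1", "P1", "undiagnosed", None, "61833", 2.4),
    ("S1", "P1", "undiagnosed", None, "GENE_X", 0.9),
    ("S2", "P2", "control", None, "61833", 1.1),
]
conn.executemany("INSERT INTO expression_record VALUES (?, ?, ?, ?, ?, ?)", rows)


def levels_for_gene(connection, gene):
    """Return (sample_id, level) pairs for one gene, ordered by sample."""
    cur = connection.execute(
        "SELECT sample_id, level FROM expression_record "
        "WHERE gene = ? ORDER BY sample_id",
        (gene,),
    )
    return cur.fetchall()


print(levels_for_gene(conn, "61833"))
```

Storing one row per sample and gene, rather than one column per gene, lets values for additional genes be added to a record without altering the table.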

[0765] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 61833 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used to diagnose a proliferative and/or differentiative cellular disorder, or a disorder of the brain, in a subject wherein an increase in 61833 expression is an indication that the subject has or is disposed to having a proliferative and/or differentiative cellular disorder or a disorder of the brain. Alternatively, the method can be used to diagnose a cardiovascular disorder, e.g., atherosclerosis, in a subject wherein a decrease in 61833 expression is an indication that the subject has or is disposed to having a cardiovascular disorder. The method can be used to monitor a treatment for a proliferative and/or differentiative cellular disorder, a cardiovascular disorder, or a disorder of the brain in a subject. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[0766] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 61833 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for a desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from an uncontacted cell.

[0767] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 61833 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profiles is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
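
The distance-vector metric mentioned above can be sketched as follows. The sketch assumes that "length" means the Euclidean norm, which the text does not specify; the function names and profile values are hypothetical.

```python
import math


def distance_vector(subject, reference):
    """Element-wise difference between two profiles of equal dimension."""
    if len(subject) != len(reference):
        raise ValueError("profiles must have the same number of dimensions")
    return [s - r for s, r in zip(subject, reference)]


def profile_distance(subject, reference):
    """Euclidean length of the distance vector (an assumed choice of norm)."""
    return math.sqrt(sum(d * d for d in distance_vector(subject, reference)))


# Hypothetical three-dimensional profiles; one dimension would hold the 61833 value.
subject_profile = [3.0, 1.0, 2.0]
reference_profile = [0.0, 5.0, 2.0]

print(distance_vector(subject_profile, reference_profile))
print(profile_distance(subject_profile, reference_profile))
```

A smaller distance indicates a more similar pair of profiles, so step d) amounts to choosing the reference profile with the minimum distance.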

[0768] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[0769] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile, and the reference expression profiles each include a value representing the level of 61833 expression.
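
A minimal sketch of the steps just listed is given below. It assumes an in-memory dictionary in place of a real database and uses cosine similarity as the comparison score; both choices, and all names and values, are illustrative assumptions rather than anything specified in the text.

```python
import math

# Hypothetical stand-in for a database of reference expression profiles.
REFERENCE_DB = {
    "normal": [1.0, 2.0, 3.0, 4.0],
    "disorder_A": [4.0, 3.0, 2.0, 1.0],
}


def cosine_similarity(a, b):
    """Comparison score in [-1, 1]; assumes neither profile is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def comparison_scores(subject, database):
    """Option ii): one score per reference profile."""
    return {name: cosine_similarity(subject, ref) for name, ref in database.items()}


def select_matching_reference(subject, database):
    """Option i): name of the reference profile most similar to the subject."""
    scores = comparison_scores(subject, database)
    return max(scores, key=scores.get)


# "Receive" a subject expression profile (made-up values).
subject = [1.1, 2.1, 2.9, 4.2]
print(select_matching_reference(subject, REFERENCE_DB))
```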

[0770] 61833 Arrays and Uses Thereof

[0771] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to a 61833 molecule (e.g., a 61833 nucleic acid or a 61833 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm², and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to the addresses of the plurality can be disposed on the array.

[0772] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to a 61833 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 61833. Each address of the subset can include a capture probe that hybridizes to a different region of a 61833 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for a 61833 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 61833 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 61833 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).

[0773] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[0774] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to a 61833 polypeptide or fragment thereof. The polypeptide can be a naturally-occurring interaction partner of 61833 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-61833 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[0775] In another aspect, the invention features a method of analyzing the expression of 61833. The method includes providing an array as described above; contacting the array with a sample; and detecting binding of a 61833-molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally, the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[0776] In another embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array, particularly the expression of 61833. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 61833. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
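
To make the clustering step concrete, the sketch below groups genes whose expression across samples is strongly correlated. It implements single-linkage grouping at a fixed Pearson correlation threshold, which is only one simple form of hierarchical clustering; the gene labels other than 61833, the threshold, and the expression values are hypothetical.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation of two equal-length series (assumes non-constant input)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def single_linkage_clusters(profiles, min_corr):
    """Merge genes that correlate at >= min_corr, directly or through a chain."""
    parent = {gene: gene for gene in profiles}

    def find(gene):
        # Follow parent links to the cluster representative.
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]
            gene = parent[gene]
        return gene

    for g1, g2 in combinations(profiles, 2):
        if pearson(profiles[g1], profiles[g2]) >= min_corr:
            parent[find(g1)] = find(g2)

    clusters = {}
    for gene in profiles:
        clusters.setdefault(find(gene), set()).add(gene)
    return list(clusters.values())


# Made-up expression levels across four samples.
profiles = {
    "61833": [1.0, 2.0, 3.0, 4.0],
    "GENE_A": [2.0, 4.1, 5.9, 8.0],
    "GENE_B": [4.0, 3.0, 2.0, 1.0],
    "GENE_C": [8.0, 6.1, 3.9, 2.0],
}
clusters = single_linkage_clusters(profiles, min_corr=0.9)
print(clusters)
```

Genes that land in the same cluster as 61833 would be candidates for co-regulation, subject to the number and diversity of samples analyzed.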

[0777] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 61833 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[0778] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of other than the target gene can be ascertained and counteracted.

[0779] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of a 61833-associated disease or disorder; and processes, such as a cellular transformation associated with a 61833-associated disease or disorder. The method can also evaluate the treatment and/or progression of a 61833-associated disease or disorder.

[0780] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 61833) that could serve as a molecular target for diagnosis or therapeutic intervention.

[0781] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon a 61833 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to a 61833 polypeptide or fragment thereof. For example, multiple variants of a 61833 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the addresses of the plurality can be disposed on the array.

[0782] The polypeptide array can be used to detect a 61833 binding compound, e.g., an antibody in a sample from a subject with specificity for a 61833 polypeptide or the presence of a 61833-binding protein or ligand.

[0783] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 61833 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[0784] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 61833 or from a cell or subject in which a 61833 mediated response has been elicited, e.g., by contact of the cell with 61833 nucleic acid or protein, or administration to the cell or subject of 61833 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 61833 (or does not express as highly as in the case of the 61833 positive plurality of capture probes) or from a cell or subject in which a 61833 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which is preferably other than a 61833 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[0785] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, contacting the array with a first sample from a cell or subject which expresses or mis-expresses 61833 or from a cell or subject in which a 61833-mediated response has been elicited, e.g., by contact of the cell with 61833 nucleic acid or protein, or administration to the cell or subject of 61833 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a second sample from a cell or subject which does not express 61833 (or does not express as highly as in the case of the 61833 positive plurality of capture probes) or from a cell or subject in which a 61833 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used, the plurality of addresses with capture probes should be present on both arrays.

[0786] In another aspect, the invention features a method of analyzing 61833, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 61833 nucleic acid or amino acid sequence; and comparing the 61833 sequence with one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 61833.

[0787] Detection of 61833 Variations or Mutations

[0788] The methods of the invention can also be used to detect genetic alterations in a 61833 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by misregulation in 61833 protein activity or nucleic acid expression, such as a cellular proliferative or differentiative disorder, a cardiovascular disorder, or a disorder of the brain. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a 61833-protein, or the mis-expression of the 61833 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of 1) a deletion of one or more nucleotides from a 61833 gene; 2) an addition of one or more nucleotides to a 61833 gene; 3) a substitution of one or more nucleotides of a 61833 gene, 4) a chromosomal rearrangement of a 61833 gene; 5) an alteration in the level of a messenger RNA transcript of a 61833 gene, 6) aberrant modification of a 61833 gene, such as of the methylation pattern of the genomic DNA, 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of a 61833 gene, 8) a non-wild type level of a 61833-protein, 9) allelic loss of a 61833 gene, and 10) inappropriate post-translational modification of a 61833-protein.

[0789] An alteration can be detected without a probe/primer in a polymerase chain reaction, such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 61833-gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a 61833 gene under conditions such that hybridization and amplification of the 61833-gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[0790] In another embodiment, mutations in a 61833 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA are isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment lengths are determined, e.g., by gel electrophoresis, and compared. Differences in fragment lengths between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
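
The fragment-length comparison described above can be illustrated in silico. The sketch below digests a linear sequence at every occurrence of a recognition site on the given strand and compares the resulting fragment lengths; EcoRI (G^AATTC) is used only as a familiar example enzyme, and the sequences are invented placeholders rather than 61833 sequence.

```python
def digest(sequence, site, cut_offset):
    """Fragment lengths after cutting a linear sequence at each occurrence of site.

    cut_offset is the number of bases into the site at which the strand is cut.
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# EcoRI recognizes GAATTC and cuts after the first G.
ECORI_SITE, ECORI_OFFSET = "GAATTC", 1

# Hypothetical control DNA with two sites, and sample DNA in which a single
# base change (A to G) destroys the second site.
control = "ATGC" + "GAATTC" + "TTAGGCC" + "GAATTC" + "AAT"
sample = "ATGC" + "GAATTC" + "TTAGGCC" + "GAGTTC" + "AAT"

control_fragments = digest(control, ECORI_SITE, ECORI_OFFSET)
sample_fragments = digest(sample, ECORI_SITE, ECORI_OFFSET)
print(control_fragments, sample_fragments)
print("pattern differs:", sorted(control_fragments) != sorted(sample_fragments))
```

Here the loss of one site merges two control fragments into a single longer sample fragment, which is the kind of difference a gel would reveal.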

[0791] In other embodiments, genetic mutations in 61833 can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the other. A different probe is located at each address of the plurality. A probe can be complementary to a region of a 61833 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of a 61833 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 61833 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
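
The "linear arrays of sequential overlapping probes" used in the first scanning step can be sketched as a sliding window over the target, with each probe being the reverse complement of its window. The probe length, step size, and target sequence below are arbitrary placeholders, not values from the cited references and not 61833 sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def tiled_probes(target, probe_len, step=1):
    """(start, probe) pairs; each probe is complementary to one target window."""
    return [
        (start, reverse_complement(target[start:start + probe_len]))
        for start in range(0, len(target) - probe_len + 1, step)
    ]


target = "ATGGCCATTG"  # placeholder target region
probes = tiled_probes(target, probe_len=6, step=2)
for start, probe in probes:
    print(start, probe)
```

With a step smaller than the probe length, consecutive probes overlap, so every base of the target is interrogated by more than one probe.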

[0792] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 61833 gene and detect mutations by comparing the sequence of the sample 61833 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.

[0793] Other methods for detecting mutations in the 61833 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl. Acad Sci USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295).

[0794] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 61833 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymidine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[0795] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 61833 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc Natl. Acad. Sci USA 86:2766; see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control 61833 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence; the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).

[0796] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[0797] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad. Sci USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[0798] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.

[0799] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 50%, 60%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to a 61833 nucleic acid.
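
One way to read the percentage figures above is as the fraction of oligonucleotide positions that form a Watson-Crick pair with the target when the two strands are aligned antiparallel. The sketch below computes that fraction for an oligonucleotide and a target region of equal length; it ignores gaps and wobble pairs, and the sequences are placeholders rather than 61833 sequence.

```python
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def percent_complementary(oligo, target_region):
    """Percent of positions that pair when oligo and target align antiparallel.

    Both sequences are written 5' to 3' and must be the same length.
    """
    if len(oligo) != len(target_region):
        raise ValueError("oligo and target region must be the same length")
    paired = sum(
        (o, t) in WATSON_CRICK for o, t in zip(oligo, reversed(target_region))
    )
    return 100.0 * paired / len(oligo)


target_region = "ATGGCCATTG"   # placeholder target, 5' to 3'
perfect_oligo = "CAATGGCCAT"   # its exact reverse complement
one_mismatch = "CAATGGCCAA"    # last base changed from T to A

print(percent_complementary(perfect_oligo, target_region))
print(percent_complementary(one_mismatch, target_region))
```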

[0800] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO:10 or the complement of SEQ ID NO:10. Different locations can be different but overlapping, or non-overlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[0801] The set can be useful, e.g., for identifying SNPs, or identifying specific alleles of 61833. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a biallelic or polymorphic locus.

[0802] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymidine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the T_(m) of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
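
The four-oligonucleotide set described above can be generated mechanically from one base sequence by substituting each nucleotide at the interrogation position. The function name, the zero-based position convention, and the example sequence are assumptions made for illustration.

```python
def interrogation_set(oligo, position):
    """Four oligos identical except for A, C, G, or T at position (0-based)."""
    if not 0 <= position < len(oligo):
        raise IndexError("interrogation position lies outside the oligonucleotide")
    return {
        base: oligo[:position] + base + oligo[position + 1:]
        for base in "ACGT"
    }


# Placeholder oligonucleotide; position 4 is the hypothetical interrogation site.
probe_set = interrogation_set("CAATGGCCAT", 4)
for base, sequence in probe_set.items():
    print(base, sequence)
```

Each member of the set could then carry a distinguishable label or occupy its own array address, so that the strongest signal reports the nucleotide present in the sample at that site.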

[0803] In a preferred embodiment the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, a 61833 nucleic acid.

[0804] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a 61833 gene.

[0805] Use of 61833 Molecules as Surrogate Markers

[0806] The 61833 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 61833 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 61833 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker which correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[0807] The 61833 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker which correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., a 61833 marker) transcription or expression, the amplified marker may be in a quantity which is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-61833 antibodies may be employed in an immune-based detection system for a 61833 protein marker, or 61833-specific radiolabeled probes may be used to detect a 61833 mRNA marker. Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. 
Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[0808] The 61833 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker which correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35:1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA, or protein (e.g., 61833 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 61833 DNA may correlate with drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[0809] Pharmaceutical Compositions of 61833

[0810] The nucleic acid and polypeptides, fragments thereof, as well as anti-61833 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[0811] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[0812] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol or sorbitol, or sodium chloride, in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

[0813] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying which yields a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[0814] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[0815] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[0816] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[0817] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[0818] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[0819] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[0820] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds which exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
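The therapeutic index defined above is a simple ratio, sketched here for illustration. The function name is hypothetical, and the numeric values in the usage note are invented for the arithmetic only. They do not describe any compound of the invention.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index, expressed as the ratio LD50/ED50.

    Both doses must be positive and given in the same units (e.g., mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50
```

With a hypothetical LD50 of 200 mg/kg and ED50 of 10 mg/kg, the index is 20. A larger value indicates a wider margin between the therapeutically effective dose and the lethal dose.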

[0821] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.

[0822] As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
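The weight-based ranges and weekly schedule above reduce to straightforward arithmetic, sketched below. This is an illustration only: the function names are hypothetical, the bounds are the broadest range stated above (about 0.001 to 30 mg/kg, once per week for about 1 to 10 weeks), and the body weight in the usage note is an invented example.

```python
# Broadest protein/polypeptide dosage range stated in the text (mg per kg body weight).
MIN_MG_PER_KG = 0.001
MAX_MG_PER_KG = 30.0


def total_dose_mg(body_weight_kg: float, dose_mg_per_kg: float) -> float:
    """Convert a weight-based dose (mg/kg) into a total dose (mg) per administration."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if not MIN_MG_PER_KG <= dose_mg_per_kg <= MAX_MG_PER_KG:
        raise ValueError("dose lies outside the stated 0.001 to 30 mg/kg range")
    return body_weight_kg * dose_mg_per_kg


def weekly_schedule_days(weeks: int) -> list:
    """Return administration days (day 0 = first dose) for once-weekly dosing."""
    if not 1 <= weeks <= 10:
        raise ValueError("the stated schedule spans about 1 to 10 weeks")
    return [7 * week for week in range(weeks)]
```

For a hypothetical 70 kg subject at 5 mg/kg, `total_dose_mg(70.0, 5.0)` gives 350 mg per administration, and `weekly_schedule_days(4)` gives days 0, 7, 14, and 21.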

[0823] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J Acquired Immune Deficiency Syndromes and Human Retrovirology 14:193).

[0824] The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[0825] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

[0826] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive metal ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, and puromycin and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine and vinblastine).

[0827] The conjugates of the invention can be used for modifying a given biological response; the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[0828] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[0829] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[0830] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[0831] Methods of Treatment for 61833

[0832] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 61833 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[0833] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. “Pharmacogenomics”, as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype”, or “drug response genotype”.) Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 61833 molecules of the present invention or 61833 modulators according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

[0834] In one aspect, the invention provides a method for preventing in a subject, a disease or condition associated with an aberrant or unwanted 61833 expression or activity, by administering to the subject a 61833 or an agent which modulates 61833 expression or at least one 61833 activity. Subjects at risk for a disease which is caused or contributed to by aberrant or unwanted 61833 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 61833 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 61833 aberrance, for example, a 61833, 61833 agonist or 61833 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[0835] It is possible that some 61833 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[0836] The 61833 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of cellular proliferative and/or differentiative disorders, cardiovascular disorders, and disorders of the brain, examples of which have been described above, as well as disorders associated with bone metabolism, immune disorders, liver disorders, viral diseases, pain or metabolic disorders.

[0837] Aberrant expression and/or activity of 61833 molecules may mediate disorders associated with bone metabolism. “Bone metabolism” refers to direct or indirect effects in the formation or degeneration of bone structures, e.g., bone formation, bone resorption, etc., which may ultimately affect the concentrations in serum of calcium and phosphate. This term also includes activities mediated by the effects of 61833 molecules in bone cells, e.g., osteoclasts and osteoblasts, that may in turn result in bone formation and degeneration. For example, 61833 molecules may support different activities of bone resorbing osteoclasts such as the stimulation of differentiation of monocytes and mononuclear phagocytes into osteoclasts. Accordingly, 61833 molecules that modulate the production of bone cells can influence bone formation and degeneration, and thus may be used to treat bone disorders. Examples of such disorders include, but are not limited to, osteoporosis, osteodystrophy, osteomalacia, rickets, osteitis fibrosa cystica, renal osteodystrophy, osteosclerosis, anti-convulsant treatment, osteopenia, fibrogenesis-imperfecta ossium, secondary hyperparathyroidism, hypoparathyroidism, hyperparathyroidism, cirrhosis, obstructive jaundice, drug induced metabolism, medullary carcinoma, chronic renal disease, rickets, sarcoidosis, glucocorticoid antagonism, malabsorption syndrome, steatorrhea, tropical sprue, idiopathic hypercalcemia and milk fever.

[0838] The 61833 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of immune disorders. Examples of immune disorders or diseases include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy such as, atopic allergy.

[0839] Disorders which may be treated or diagnosed by methods described herein include, but are not limited to, disorders associated with an accumulation in the liver of fibrous tissue, such as that resulting from an imbalance between production and degradation of the extracellular matrix accompanied by the collapse and condensation of preexisting fibers. The methods described herein can be used to diagnose or treat hepatocellular necrosis or injury induced by a wide variety of agents including processes which disturb homeostasis, such as an inflammatory process, tissue damage resulting from toxic injury or altered hepatic blood flow, and infections (e.g., bacterial, viral and parasitic). For example, the methods can be used for the early detection of hepatic injury, such as portal hypertension or hepatic fibrosis. In addition, the methods can be employed to detect liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, A1-antitrypsin deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease), disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome). 
Additionally, the methods described herein may be useful for the early detection and treatment of liver injury associated with the administration of various chemicals or drugs, such as for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol, or which represents a hepatic manifestation of a vascular disorder such as obstruction of either the intrahepatic or extrahepatic bile flow or an alteration in hepatic circulation resulting, for example, from chronic heart failure, veno-occlusive disease, portal vein thrombosis or Budd-Chiari syndrome.

[0840] Additionally, 61833 molecules may play an important role in the etiology of certain viral diseases, including but not limited to Hepatitis B, Hepatitis C and Herpes Simplex Virus (HSV). Modulators of 61833 activity could be used to control viral diseases. The modulators can be used in the treatment and/or diagnosis of viral infected tissue or virus-associated tissue fibrosis, especially liver and liver fibrosis. Also, 61833 modulators can be used in the treatment and/or diagnosis of virus-associated carcinoma, especially hepatocellular cancer.

[0841] Additionally, 61833 may play an important role in the regulation of metabolism or pain disorders. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes. Examples of pain disorders include, but are not limited to, pain response elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia, usually referred to as hyperalgesia (described in, for example, Fields, H. L. (1987) Pain, New York: McGraw-Hill); pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches; pain associated with surgery; pain related to irritable bowel syndrome; or chest pain.

[0842] As discussed, successful treatment of 61833 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using an assay described above, that prove to exhibit negative modulatory activity can be used in accordance with the invention to prevent and/or ameliorate symptoms of 61833 disorders. Such molecules can include, but are not limited to, peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)₂ and Fab expression library fragments, scFV molecules, and epitope-binding fragments thereof).

[0843] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[0844] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[0845] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 61833 expression is through the use of aptamer molecules specific for 61833 protein. Aptamers are nucleic acid molecules having a tertiary structure which permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. (1997) Curr. Opin. Chem Biol. 1: 5-9; and Patel, D. J. (1997) Curr Opin Chem Biol 1:32-46). Since nucleic acid molecules may in many cases be more conveniently introduced into target cells than therapeutic protein molecules may be, aptamers offer a method by which 61833 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[0846] Antibodies can be generated that are both specific for target gene product and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 61833 disorders. For a description of antibodies, see the Antibody section above.

[0847] In circumstances wherein injection of an animal or a human subject with a 61833 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 61833 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. (1999) Ann Med 31:66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. (1998) Cancer Treat Res. 94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 61833 protein. Vaccines directed to a disease characterized by 61833 expression may also be generated in this fashion.

[0848] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see e.g., Marasco et al. (1993) Proc. Natl. Acad. Sci. USA 90:7889-7893).

[0849] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 61833 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures as described above.

[0850] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED₅₀ with little or no toxicity. The dosage can vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose can be formulated in animal models to achieve a circulating plasma concentration range that includes the IC₅₀ (i.e., the concentration of the test compound that achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma can be measured, for example, by high performance liquid chromatography.
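The IC₅₀ referred to above can be read off a measured dose-response series in several ways. The following is a minimal sketch of one of them, log-linear interpolation between the two concentrations that bracket 50% inhibition. The function name `estimate_ic50` and the data points are hypothetical and chosen only for illustration; they are not taken from this disclosure, and a fitted sigmoidal model may be preferred in practice.

```python
import math


def estimate_ic50(concentrations, percent_inhibition):
    """Estimate IC50 by linear interpolation on log10(concentration).

    concentrations: positive values in ascending order (any consistent unit).
    percent_inhibition: measured inhibition (0-100) at each concentration.
    Returns None when the data never cross 50% inhibition.
    """
    points = list(zip(concentrations, percent_inhibition))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if y_lo <= 50.0 <= y_hi and y_hi != y_lo:
            # Fraction of the way from the lower to the upper measurement.
            frac = (50.0 - y_lo) / (y_hi - y_lo)
            log_ic50 = math.log10(c_lo) + frac * (
                math.log10(c_hi) - math.log10(c_lo)
            )
            return 10 ** log_ic50
    return None


# Hypothetical cell culture data (illustrative values only).
doses = [1.0, 10.0, 100.0, 1000.0]
inhibition = [5.0, 25.0, 75.0, 95.0]
ic50 = estimate_ic50(doses, inhibition)
print(ic50)
```

Interpolating on the logarithm of concentration reflects the usual practice of spacing test concentrations geometrically.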

[0851] Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. The compound which is able to modulate 61833 activity is used as a template, or “imprinting molecule”, to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix which contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al (1996) Current Opinion in Biotechnology 7:89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2:166-173. Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al (1993) Nature 361:645-647. Through the use of isotope-labeling, the “free” concentration of compound which modulates the expression or activity of 61833 can be readily monitored and used in calculations of IC₅₀.

[0852] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC₅₀. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al (1995) Analytical Chemistry 67:2142-2144.

[0853] Another aspect of the invention pertains to methods of modulating 61833 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with a 61833 molecule or an agent that modulates one or more of the activities of 61833 protein associated with the cell. An agent that modulates 61833 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of a 61833 protein (e.g., a 61833 substrate or receptor), a 61833 antibody, a 61833 agonist or antagonist, a peptidomimetic of a 61833 agonist or antagonist, or other small molecule.

[0854] In one embodiment, the agent stimulates one or more 61833 activities. Examples of such stimulatory agents include active 61833 protein and a nucleic acid molecule encoding 61833. In another embodiment, the agent inhibits one or more 61833 activities. Examples of such inhibitory agents include antisense 61833 nucleic acid molecules, anti-61833 antibodies, and 61833 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of a 61833 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., up regulates or down regulates) 61833 expression or activity. In another embodiment, the method involves administering a 61833 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 61833 expression or activity.

[0855] Stimulation of 61833 activity is desirable in situations in which 61833 is abnormally downregulated and/or in which increased 61833 activity is likely to have a beneficial effect. Likewise, inhibition of 61833 activity is desirable in situations in which 61833 is abnormally upregulated and/or in which decreased 61833 activity is likely to have a beneficial effect.

[0856] 61833 Pharmacogenomics

[0857] The 61833 molecules of the present invention, as well as agents, or modulators which have a stimulatory or inhibitory effect on 61833 activity (e.g., 61833 gene expression) as identified by a screening assay described herein can be administered to individuals to treat (prophylactically or therapeutically) 61833 associated disorders (e.g., cellular proliferative or differentiative disorders, cardiovascular disorders, or disorders of the brain) associated with aberrant or unwanted 61833 activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a 61833 molecule or 61833 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a 61833 molecule or 61833 modulator.

[0858] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23:983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43:254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as single factors altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase (G6PD) deficiency is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[0859] One pharmacogenomics approach to identifying genes that predict drug response, known as “a genome-wide association”, relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high resolution map can be generated from a combination of some ten-million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
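The grouping of individuals by SNP pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions: the subject labels, marker names (`snp_a`, `snp_b`), genotype calls, and the function `group_by_snp_pattern` are all hypothetical, and a real study would involve far more markers and a statistical test of association with drug response.

```python
from collections import defaultdict


def group_by_snp_pattern(genotypes, marker_ids):
    """Group subjects that share the same genotype calls at the given markers.

    genotypes: mapping of subject -> {marker id -> genotype call}.
    marker_ids: ordered list of markers that define the pattern.
    Returns a mapping of pattern (tuple of calls) -> list of subjects.
    """
    groups = defaultdict(list)
    for subject, calls in genotypes.items():
        pattern = tuple(calls[marker] for marker in marker_ids)
        groups[pattern].append(subject)
    return dict(groups)


# Hypothetical genotype calls (illustrative only).
genotypes = {
    "subject_1": {"snp_a": "A/A", "snp_b": "C/T"},
    "subject_2": {"snp_a": "A/G", "snp_b": "C/C"},
    "subject_3": {"snp_a": "A/A", "snp_b": "C/T"},
}
groups = group_by_snp_pattern(genotypes, ["snp_a", "snp_b"])
for pattern, subjects in groups.items():
    print(pattern, subjects)
```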

[0860] Alternatively, a method termed the “candidate gene approach,” can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., a 61833 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.

[0861] Alternatively, a method termed “gene expression profiling” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., a 61833 molecule or 61833 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[0862] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 61833 molecule or 61833 modulator, such as a modulator identified by one of the exemplary screening assays described herein.

[0863] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 61833 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 61833 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells, e.g., human cells, will become sensitive to treatment with an agent that the unmodified target cells were resistant to.

[0864] Monitoring the influence of agents (e.g., drugs) on the expression or activity of a 61833 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 61833 gene expression, protein levels, or upregulate 61833 activity, can be monitored in clinical trials of subjects exhibiting decreased 61833 gene expression, protein levels, or downregulated 61833 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 61833 gene expression, protein levels, or downregulate 61833 activity, can be monitored in clinical trials of subjects exhibiting increased 61833 gene expression, protein levels, or upregulated 61833 activity. In such clinical trials, the expression or activity of a 61833 gene, and preferably, other genes that have been implicated in, for example, a 61833-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[0865] 61833 Informatics

[0866] The sequence of a 61833 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a 61833 sequence. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 61833 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequence, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[0867] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[0868] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[0869] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples for annotation to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites, and splice junctions. Non-limiting examples for annotations to amino acid sequence include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
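The two-table layout described above can be sketched with any relational database. The following minimal example uses SQLite through Python's standard `sqlite3` module purely as a stand-in for the database products named above; the table names, column names, identifier `example_seq`, and the short placeholder sequence are assumptions made for illustration and are not a sequence of the invention.

```python
import sqlite3

# In-memory database used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- First table: sequence in the first column, identifier in the second.
    CREATE TABLE sequences (
        residues TEXT NOT NULL,
        seq_id   TEXT PRIMARY KEY
    );
    -- Second table: one row per annotation, with 1-based start/end positions.
    CREATE TABLE annotations (
        seq_id     TEXT NOT NULL REFERENCES sequences(seq_id),
        descriptor TEXT NOT NULL,
        start_pos  INTEGER NOT NULL,
        end_pos    INTEGER NOT NULL
    );
    """
)

# Placeholder data (hypothetical, not from the disclosure).
conn.execute(
    "INSERT INTO sequences (residues, seq_id) VALUES (?, ?)",
    ("ATGGCCAAGTGA", "example_seq"),
)
conn.execute(
    "INSERT INTO annotations VALUES (?, ?, ?, ?)",
    ("example_seq", "open reading frame", 1, 12),
)

# Retrieve each annotation together with the residues it refers to.
rows = conn.execute(
    """
    SELECT s.seq_id,
           a.descriptor,
           substr(s.residues, a.start_pos, a.end_pos - a.start_pos + 1)
    FROM sequences AS s
    JOIN annotations AS a ON a.seq_id = s.seq_id
    """
).fetchall()
print(rows)
```

Keeping annotations in their own table lets a single sequence carry any number of annotated features without changing the sequence table.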

[0870] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.
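As a simple illustration of searching stored sequences for regions that match a target motif, the sketch below scans each stored sequence with a regular expression and reports 1-based match positions. It is not a BLAST search and does not score inexact matches; the function name `find_matches`, the stored sequences, and the motif are hypothetical.

```python
import re


def find_matches(stored, target_pattern):
    """Return (sequence id, 1-based start, matched text) for every match.

    stored: mapping of sequence id -> residue string.
    target_pattern: regular expression describing the target motif.
    A lookahead is used so that overlapping matches are also reported.
    """
    hits = []
    for seq_id, residues in stored.items():
        for match in re.finditer(f"(?=({target_pattern}))", residues):
            hits.append((seq_id, match.start() + 1, match.group(1)))
    return hits


# Hypothetical stored sequences and motif (illustrative only).
stored = {"seq_1": "ATGGACCGGTCCTAA", "seq_2": "CCCCAAAA"}
hits = find_matches(stored, "GG[AT]CC")
print(hits)
```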

[0871] Thus, in one aspect, the invention features a method of analyzing 61833, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing a 61833 nucleic acid or amino acid sequence; comparing the 61833 sequence with a second sequence, e.g., one or, more preferably, a plurality of sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 61833. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[0872] The method can include evaluating the sequence identity between a 61833 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.
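One way to evaluate sequence identity, once two sequences have already been aligned, is to count identical residues over the aligned columns. The sketch below assumes the alignment is supplied (with `-` marking gaps) and uses one common convention, ignoring gapped columns in the denominator; other conventions exist, and the function name and example strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over ungapped columns of a pairwise alignment.

    aligned_a, aligned_b: equal-length strings in which '-' marks a gap.
    Columns containing a gap in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Hypothetical aligned pair (illustrative only).
identity = percent_identity("ATGC-A", "ATGGTA")
print(identity)
```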

[0873] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
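The relationship between target length and chance occurrence can be made concrete with a rough estimate. Assuming, as a simplification not stated in the source, that database residues are independent and equally likely, a target of length L is expected to occur by chance about (N − L + 1) × 4^(−L) times in N nucleotides. The function below is a hypothetical helper implementing that estimate.

```python
def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected number of chance occurrences of one specific target.

    Assumes independent, uniformly distributed residues, which real
    sequence databases only approximate.
    """
    positions = max(database_length - target_length + 1, 0)
    return positions * alphabet_size ** (-target_length)


# A 6-nucleotide target versus a 30-nucleotide target (illustrative sizes).
short_hits = expected_random_hits(6, 4096 + 5)
long_hits = expected_random_hits(30, 10**9)
print(short_hits, long_hits)
```

Under this simplified model a six-nucleotide target is expected about once in roughly four thousand positions, whereas a 30-nucleotide target is very unlikely to occur by chance even in a database of a billion nucleotides.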

[0874] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly and a variety of commercially available software for conducting search means are available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[0875] Thus, the invention features a method of making a computer readable record of a 61833 sequence which includes recording the sequence on a computer readable matrix. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.
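A record of the kind described above can be sketched as a small data structure that stores the sequence together with the position of an identified ORF. The example is a simplified assumption: the class `SequenceRecord`, the function `annotate_first_orf`, and the placeholder sequence are hypothetical, and the ORF search considers only the forward strand and the first in-frame start-to-stop span.

```python
from dataclasses import dataclass
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass
class SequenceRecord:
    identifier: str
    residues: str
    orf_start: Optional[int] = None  # 1-based position of the A in ATG
    orf_end: Optional[int] = None    # 1-based position of the last stop base


def annotate_first_orf(record):
    """Record the first forward-strand ATG...stop span, if one exists."""
    seq = record.residues.upper()
    start = seq.find("ATG")
    while start != -1:
        # Walk codon by codon from the codon after ATG.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                record.orf_start = start + 1
                record.orf_end = i + 3
                return record
        start = seq.find("ATG", start + 1)
    return record


# Placeholder sequence (hypothetical, not a sequence of the invention).
record = annotate_first_orf(SequenceRecord("example_seq", "CCATGGCCAAGTGACC"))
print(record)
```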

[0876] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing a 61833 sequence, or record, in machine-readable form; comparing a second sequence to the 61833 sequence; thereby analyzing a sequence. Comparison can include comparing to sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 61833 sequence includes a sequence being compared. In a preferred embodiment the 61833 or second sequence is stored on a first computer, e.g., at a first site and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 61833 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison can be performed, read, or recorded on a second computer. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[0877] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has a 61833-associated disease or disorder or a pre-disposition to a 61833-associated disease or disorder, wherein the method comprises the steps of determining 61833 sequence information associated with the subject and based on the 61833 sequence information, determining whether the subject has a 61833-associated disease or disorder or a pre-disposition to a 61833-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[0878] The invention further provides, in an electronic system and/or in a network, a method for determining whether a subject has a 61833-associated disease or disorder or a pre-disposition to a disease associated with a 61833, wherein the method comprises the steps of determining 61833 sequence information associated with the subject, and based on the 61833 sequence information, determining whether the subject has a 61833-associated disease or disorder or a pre-disposition to a 61833-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, comparing the 61833 sequence of the subject to the 61833 sequences in the database to thereby determine whether the subject has a 61833-associated disease or disorder, or a pre-disposition for such.

[0879] The present invention also provides, in a network, a method for determining whether a subject has a 61833-associated disease or disorder or a pre-disposition to a 61833-associated disease or disorder, said method comprising the steps of receiving 61833 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 61833 and/or corresponding to a 61833-associated disease or disorder (e.g., a cellular proliferative or differentiative disorder, cardiovascular disorder, or a disorder of the brain), and based on one or more of the phenotypic information, the 61833 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has a 61833-associated disease or disorder or a pre-disposition to a 61833-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[0880] The present invention also provides a method for determining whether a subject has a 61833-associated disease or disorder or a pre-disposition to a 61833-associated disease or disorder, said method comprising the steps of receiving information related to 61833 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 61833 and/or related to a 61833-associated disease or disorder, and based on one or more of the phenotypic information, the 61833 information, and the acquired information, determining whether the subject has a 61833-associated disease or disorder or a pre-disposition to a 61833-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[0881] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

Background of the 26493 Invention

[0882] The avoidance of mutation by living cells is achieved through several processes operating at different stages of DNA replication. Some of these processes include: selection of the correct deoxyribonucleoside triphosphate from the nucleotide pool and removal of an incorrect nucleotide (editing) by DNA polymerases at the 3′-end of the nascent DNA strand (Brutlag and Kornberg (1972) J. Biol. Chem. 247: 241-248); specific post- (or pre-) replication repair processes such as the removal of potentially mutagenic bases from the template (Lindahl (1974) Proc. Natl. Acad. Sci. 71: 3649-3653); and finally, post-replicative removal of incorrectly inserted deoxynucleotides from within newly replicated material by mismatch repair activities, which recognize the newly synthesized DNA by a strand targeting signal that is distinct from, and distal to, the mismatch (Claverys and Lacks (1986) Microbiol. Rev. 50: 133-165; Modrich (1991) Annu Rev Genetics 25: 229-253).

[0883] One example of an error avoidance pathway is the GO system. The GO system is a pathway devoted to enhancing the fidelity of DNA replication and is responsible for protecting DNA from one form of oxidative damage (Michaels M. L., et al. (1992) Proc. Natl. Acad. Sci. USA 89:7022-7025). Although the system has been most fully characterized in E. coli, evidence exists for similar repair proteins in other prokaryotes and in higher eukaryotes. The GO system in E. coli is composed of at least three proteins, MutM, MutT and MutY (Michaels M. L., et al. (1992) J. Bacteriol. 174:6321-6325). These three proteins are responsible for removing an oxidatively damaged form of guanine (8-hydroxyguanine or 7,8-dihydro-8-oxoguanine, also known as a GO lesion) from DNA and the nucleotide pool and for the correction of error-prone translesion synthesis. 8-oxo-dGTP is inserted opposite to dA and dC residues of template DNA with almost equal efficiency, thus leading to AT to GC transversions. MutT, which is a small protein of about 12 to 15 Kd, specifically degrades 8-oxo-dGTP to the monophosphate with the concomitant release of pyrophosphate. The MutT protein has a weak GTPase activity (Bhatnagar, S. K. et al. (1991) J. Biol. Chem. 266: 9050-9054), resulting in the formation of dGMP and pyrophosphate (Maki, H. et al. (1992) Nature 355: 273-275).

[0884] A region of about 40 amino acid residues, which is found in the N-terminal portion of mutT, can also be found in a variety of other prokaryotic, viral, and eukaryotic proteins (Koonin E. V. (1993) Nucleic Acids Res. 21:4847-4847; Mejean V. et al. (1994) Mol. Microbiol. 11:323-330). The conserved 40 amino acid domain has been proposed to be involved in the active center of a family of pyrophosphate-releasing NTPases (Koonin E. V. (1993) supra). This domain contains four conserved glutamate residues.

[0885] Oxidative stress has been implicated as an important causative agent of mutagenesis, carcinogenesis, aging, and a number of diseases (Pacifici, R. E. et al. (1991) Gerontology 37:166-180). The MutM, MutT and MutY proteins have been implicated in the GO system, which is responsible for protecting DNA from one form of oxidative damage. Therefore, there exists a need for identifying novel genes and gene products that are involved in error avoidance pathways such as the GO system.

Summary of the 26493 Invention

[0886] The present invention is based, in part, on the discovery of a novel mutT family member, referred to herein as “26493”. The nucleotide sequence of a cDNA encoding 26493 is shown in SEQ ID NO: 16, and the amino acid sequence of a 26493 polypeptide is shown in SEQ ID NO: 17. In addition, the nucleotide sequence of the coding region is depicted in SEQ ID NO:18.

[0887] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes a 26493 protein or polypeptide, e.g., a biologically active portion of the 26493 protein. In a preferred embodiment the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO: 17. In other embodiments, the invention provides isolated 26493 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO: 16, SEQ ID NO: 18, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO: 16, SEQ ID NO: 18, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:16, SEQ ID NO:18, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 26493 protein or an active fragment thereof.

[0888] In a related aspect, the invention further provides nucleic acid constructs that include a 26493 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included are vectors and host cells containing the 26493 nucleic acid molecules of the invention, e.g., vectors and host cells suitable for producing 26493 nucleic acid molecules and polypeptides.

[0889] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 26493-encoding nucleic acids.

[0890] In still another related aspect, isolated nucleic acid molecules that are antisense to a 26493 encoding nucleic acid molecule are provided.

[0891] In another aspect, the invention features 26493 polypeptides, and biologically active or antigenic fragments thereof that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 26493-mediated or -related disorders. In another embodiment, the invention provides 26493 polypeptides having a 26493 activity. Preferred polypeptides are 26493 proteins including at least one mutT domain and, preferably, having a 26493 activity, e.g., a 26493 activity as described herein.

[0892] In other embodiments, the invention provides 26493 polypeptides, e.g., a 26493 polypeptide having the amino acid sequence shown in SEQ ID NO:17 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO: 17 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 16, SEQ ID NO: 18, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 26493 protein or an active fragment thereof.

[0893] In a related aspect, the invention further provides nucleic acid constructs which include a 26493 nucleic acid molecule described herein.

[0894] In a related aspect, the invention provides 26493 polypeptides or fragments operatively linked to non-26493 polypeptides to form fusion proteins. In another aspect, the invention features antibodies and antigen-binding fragments thereof, that react with, or more preferably specifically bind 26493 polypeptides or fragments thereof, e.g., a mutT domain of a 26493 polypeptide. In one embodiment, the antibodies or antigen-binding fragment thereof competitively inhibit the binding of a second antibody to a 26493 polypeptide or fragment thereof, e.g., a mutT domain of a 26493 polypeptide.

[0895] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 26493 polypeptides or nucleic acids.

[0896] In still another aspect, the invention provides a process for modulating 26493 polypeptide or nucleic acid expression or activity, e.g. using the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 26493 polypeptides or nucleic acids, such as conditions involving oxidative stress damage, deficient DNA repair, aging, mutagenesis, or conditions involving cells expressing the 26493 polypeptides, e.g., neural cells.

[0897] In yet another aspect, the invention provides methods for inhibiting the proliferation, or inducing the killing or differentiation, of a 26493-expressing cell, e.g., a hyperproliferative 26493-expressing cell. The method includes contacting the cell with an agent, e.g., a compound (e.g., a compound identified using the methods described herein), that modulates the activity, or expression, of the 26493 polypeptide or nucleic acid. In a preferred embodiment, the contacting step is effected in vitro or ex vivo. In other embodiments, the contacting step is effected in vivo, e.g., in a subject (e.g., a mammal, e.g., a human), as part of a therapeutic or prophylactic protocol.

[0898] In a preferred embodiment, the cell is a hyperproliferative cell, e.g., a cell found in a solid tumor, a soft tissue tumor, or a metastatic lesion.

[0899] In a preferred embodiment, the agent, e.g., the compound, is an inhibitor of a 26493 polypeptide. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion). In another preferred embodiment, the agent, e.g., the compound, is an inhibitor of a 26493 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[0900] In a preferred embodiment, the agent, e.g., the compound, is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[0901] In another aspect, the invention features methods for treating or preventing a disorder characterized by aberrant activity of a 26493-expressing cell, in a subject. Preferably, the method includes administering to the subject (e.g., a mammal, e.g., a human) an effective amount of a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 26493 polypeptide or nucleic acid.

[0902] In a preferred embodiment, the disorder is a cancerous or pre-cancerous condition.

[0903] In a further aspect, the invention provides methods for evaluating the efficacy of a treatment of a disorder, e.g., a proliferative disorder. The method includes: treating a subject, e.g., a patient or an animal, with a protocol under evaluation (e.g., treating a subject with one or more of: chemotherapy, radiation, and/or a compound identified using the methods described herein); and evaluating the expression of a 26493 nucleic acid or polypeptide before and after treatment. A change, e.g., a decrease or increase, in the level of a 26493 nucleic acid (e.g., mRNA) or polypeptide after treatment, relative to the level of expression before treatment, is indicative of the efficacy of the treatment of the disorder. The level of 26493 nucleic acid or polypeptide expression can be detected by any method described herein.

[0904] In a preferred embodiment, the evaluating step includes obtaining a sample (e.g., a tissue sample, e.g., a biopsy, or a fluid sample) from the subject, before and after treatment, and comparing the level of expression of a 26493 nucleic acid (e.g., mRNA) or polypeptide before and after treatment.

[0905] In another aspect, the invention provides methods for evaluating the efficacy of a therapeutic or prophylactic agent (e.g., an anti-neoplastic agent). The method includes: contacting a sample with an agent (e.g., a compound identified using the methods described herein, a cytotoxic agent) and evaluating the expression of 26493 nucleic acid or polypeptide in the sample before and after the contacting step. A change, e.g., a decrease or increase, in the level of 26493 nucleic acid (e.g., mRNA) or polypeptide in the sample obtained after the contacting step, relative to the level of expression in the sample before the contacting step, is indicative of the efficacy of the agent. The level of 26493 nucleic acid or polypeptide expression can be detected by any method described herein. In a preferred embodiment, the sample includes cells obtained from a cancerous tissue or pre-cancerous tissue. In other embodiments, the sample is from brain tissue.

[0906] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 26493 polypeptide or nucleic acid molecule, including for disease diagnosis.

[0907] In another aspect, the invention features a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes a 26493 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to a 26493 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 26493 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.

[0908] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

Detailed Description of 26493

[0909] The human 26493 sequence (see SEQ ID NO:16, as recited in Example 10), which is approximately 1902 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 1212 nucleotides, including the termination codon. The coding sequence encodes a 404 amino acid protein (see SEQ ID NO:17, as recited in Example 10).

[0910] Human 26493 contains the following regions or other structural features:

[0911] a predicted mutT domain (PFAM Accession Number PF00293) located at about amino acid residues 122 to 251 of SEQ ID NO:17; which includes a mutT domain signature located at about amino acids 156 to 175;

[0912] two predicted protein kinase C phosphorylation sites (PS00005) located at about amino acids 36 to 38 of SEQ ID NO: 17;

[0913] five predicted casein kinase II phosphorylation sites (PS00006) located at about amino acids 67 to 70, 132 to 135, 163 to 166, 187 to 190, and 211 to 214 of SEQ ID NO:17;

[0914] one predicted N-glycosylation site (PS00001) located from about amino acids 374 to 377 of SEQ ID NO:17;

[0915] seven predicted N-myristylation sites (PS00008) located at about amino acids 43 to 48, 63 to 68, 85 to 90, 96 to 101, 207 to 212, 309 to 314, and 347 to 352 of SEQ ID NO:17.

[0916] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[0917] A plasmid containing the nucleotide sequence encoding human 26493 (clone “Fbh26493FL”) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

[0918] The 26493 protein contains a significant number of structural characteristics in common with members of the mutT subfamily. MutT proteins have been shown to have GTPase activity, e.g., an oxoguanine dGTPase activity (Bhatnagar, S. K. et al. (1991) J. Biol. Chem. 266: 9050-9054). The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics.

[0919] MutT family members share a highly conserved region, which includes four conserved glutamate residues that are believed to be involved in the active center of a family of pyrophosphate-releasing NTPases (Koonin E. V. (1993) supra). MutT family members are characterized by six positions containing conserved amino acid residues having the consensus sequence: GX₅EX₅[UA]XRE&XEEX₉& (SEQ ID NO:20), wherein U is a bulky aliphatic residue, i.e. I, L, V, or M; & is a bulky hydrophobic residue, i.e. I, L, V, M, F, Y or W; and X can be any amino acid (Koonin E. V. (1993) supra). MutT family members are components of the GO system. The GO system is a pathway devoted to enhancing the fidelity of DNA replication and is responsible for protecting DNA from one form of oxidative damage (Michaels M. L., et al. (1992) Proc. Natl. Acad. Sci. USA 89:7022-7025). MutT family members are known to mediate removal of an oxidatively damaged form of guanine (8-hydroxyguanine or 7,8-dihydro-8-oxoguanine, also known as a GO lesion) from DNA and the nucleotide pool and the correction of error-prone translesion synthesis. MutT family members degrade 8-oxo-dGTP to the monophosphate with the concomitant release of pyrophosphate. The MutT proteins have also been shown to have GTPase activity, e.g., an oxoguanine dGTPase activity (Bhatnagar, S. K. et al. (1991) J. Biol. Chem. 266: 9050-9054), resulting in the formation of dGMP and pyrophosphate (Maki, H. et al. (1992) Nature 355: 273-275).
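By way of illustration only, the consensus sequence recited above can be written as a regular expression and used to scan a candidate amino acid sequence. The following is a minimal sketch, assuming that the bracketed term [UA] means "U or A" at a single position; the function name find_consensus and the toy sequence TOY_SEQ are hypothetical and are not taken from SEQ ID NO:17 or SEQ ID NO:20.

```python
import re

# Residue classes as defined in the text (Koonin 1993).
U = "ILVM"        # "U": bulky aliphatic residue
AMP = "ILVMFYW"   # "&": bulky hydrophobic residue

# GX5EX5[UA]XRE&XEEX9&, with X (any residue) written as "."
# Assumption: [UA] is read as "U or A" at one position.
MUTT_CONSENSUS = re.compile(
    "G.{5}E.{5}[" + U + "A].RE[" + AMP + "].EE.{9}[" + AMP + "]"
)

def find_consensus(seq):
    """Return 1-based (start, end) spans of non-overlapping consensus matches."""
    return [(m.start() + 1, m.end())
            for m in MUTT_CONSENSUS.finditer(seq.upper())]

# Synthetic sequence built only to satisfy the pattern (hypothetical).
TOY_SEQ = ("MK" + "G" + "A" * 5 + "E" + "A" * 5 + "L" + "A"
           + "RE" + "L" + "A" + "EE" + "A" * 9 + "F" + "KK")

if __name__ == "__main__":
    print(find_consensus(TOY_SEQ))
```

A pattern search of this kind only flags candidate regions; the profile (HMM) comparison described elsewhere herein remains the basis for assigning a mutT domain.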

[0920] A 26493 polypeptide can include a “mutT domain” or regions homologous with a “mutT domain”. A 26493 can optionally further include one or more of: at least one N-glycosylation site; at least one, or preferably two protein kinase C sites; at least one, two, three, four, preferably five casein kinase II sites; or at least one, two, three, four, five, six, preferably seven N-myristoylation sites.

[0921] As used herein, the term “mutT domain” refers to a protein domain which includes one, two, three, and preferably four glutamate residues, clustered in a stretch of about forty amino acids. The glutamate residues are preferably arranged in the following sequence: EX₆₋₉RELXEE (SEQ ID NO:21), wherein X can be any amino acid. For example, the mutT domain of 26493 includes a cluster of four glutamate residues located at amino acids 162, 171, 174 and 175 of SEQ ID NO:17 (FIG. 12). Preferably, the mutT domain has an amino acid sequence of about 50 to about 300 amino acid residues and has a bit score for the alignment of the sequence to the dGTPase domain (HMM) of at least 20. Preferably, a mutT domain includes at least about 80 to about 200 amino acids, more preferably about 100 to about 150 amino acid residues, about 120 to 130, or about 127, 128 or 129 amino acids and has a bit score for the alignment of the sequence to the mutT domain (HMM) of at least 40, preferably 50, more preferably 80 or greater. The mutT domain (HMM) has been assigned the PFAM Accession Number PF00293 (http://genome.wustl.edu/Pfam/html). An alignment of the mutT domain (from about amino acids 122 to about 251 of SEQ ID NO: 17) of human 26493 with a consensus amino acid sequence derived from a hidden Markov model (PFAM) is depicted in FIG. 12.
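As a non-limiting illustration, the glutamate arrangement EX₆₋₉RELXEE can be located in a sequence and the positions of its four glutamate residues reported. The sketch below is an assumption-laden example: the function name glutamate_positions is hypothetical, and TOY_SEQ is a synthetic sequence constructed so that the motif falls at the residue numbers recited above; it is not SEQ ID NO:17.

```python
import re

# E X(6-9) R E L X E E, with X (any residue) written as "."
GLU_MOTIF = re.compile(r"E.{6,9}REL.EE")

def glutamate_positions(seq):
    """Return, for each motif match, the 1-based positions of its four glutamates."""
    results = []
    for m in GLU_MOTIF.finditer(seq.upper()):
        start0, end = m.start(), m.end()  # 0-based start, exclusive end
        # The motif ends ...R E L X E E, so counting back from the last residue:
        # last E = end, previous E = end - 1, E of "RE" = end - 4.
        results.append((start0 + 1, end - 4, end - 1, end))
    return results

# Synthetic filler with the motif placed so the first glutamate is residue 162
# (hypothetical sequence, for illustration only).
TOY_SEQ = "A" * 161 + "E" + "A" * 7 + "REL" + "A" + "EE" + "A" * 10

if __name__ == "__main__":
    print(glutamate_positions(TOY_SEQ))
```

With seven intervening residues between the first glutamate and the arginine, the four glutamates of the toy sequence fall at positions 162, 171, 174 and 175, which is the spacing recited for the 26493 mutT domain.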

[0922] In a preferred embodiment, a 26493 polypeptide or protein has a “mutT domain” or a region which includes at least about 80 to about 200 amino acids, more preferably about 100 to about 150 amino acid residues, about 120 to 130, or about 127, 128 or 129 amino acid residues and has at least about 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “mutT domain,” e.g., the mutT domain of human 26493 (e.g., residues 122 to 251 of SEQ ID NO:17).

[0923] To identify the presence of a “mutT” domain in a 26493 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against the Pfam database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference. A search was performed against the HMM database resulting in the identification of a “mutT” domain in the amino acid sequence of human 26493 at about residues 122 to 251 of SEQ ID NO:17 (see Example 10).
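The effect of the threshold score on which alignments are treated as hits can be sketched as follows. This is an illustrative example only: the DomainHit record, the passing_hits function and the listed scores are hypothetical and do not reproduce any search output; the sketch further assumes that a score equal to the threshold counts as a hit.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DomainHit:
    family: str       # profile (HMM) name
    start: int        # first aligned residue, 1-based
    end: int          # last aligned residue, 1-based
    bit_score: float  # alignment score in bits

def passing_hits(hits, threshold=15.0):
    """Keep hits whose bit score meets the threshold (15 bits by default).

    Assumption: a score equal to the threshold is treated as a hit.
    """
    return [h for h in hits if h.bit_score >= threshold]

# Invented scores, for illustration only.
EXAMPLE_HITS = [
    DomainHit("mutT", 122, 251, 42.0),
    DomainHit("other_profile", 10, 60, 9.5),
    DomainHit("weak_profile", 300, 340, 3.0),
]

if __name__ == "__main__":
    print([h.family for h in passing_hits(EXAMPLE_HITS)])
    print([h.family for h in passing_hits(EXAMPLE_HITS, threshold=8.0)])
```

Lowering the threshold from 15 bits to 8 bits admits weaker alignments, which increases sensitivity at the cost of more false positives.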

[0924] A 26493 family member can include at least one mutT domain. Furthermore, a 26493 family member can include at least one, preferably two protein kinase C phosphorylation sites (PS00005); at least 1 predicted N-glycosylation site (PS00001); at least one, two, three, four, and preferably five, predicted casein kinase II phosphorylation sites (PS00006); and at least one, two, three, four, five, six and preferably seven predicted N-myristylation sites (PS00008).

[0925] As the 26493 polypeptides of the invention may modulate 26493-mediated activities, they may be useful for developing novel diagnostic and therapeutic agents for 26493-mediated or related disorders, as described below.

[0926] As used herein, a “26493 activity”, “biological activity of 26493” or “functional activity of 26493”, refers to an activity exerted by a 26493 protein, polypeptide or nucleic acid molecule. For example, a 26493 activity can be an activity exerted by 26493 in a physiological milieu on, e.g., a 26493-responsive cell or on a 26493 substrate, e.g., a protein substrate. A 26493 activity can be determined in vivo or in vitro. In one embodiment, a 26493 activity is a direct activity, such as an association with a 26493 target molecule. A “target molecule” or “binding partner” is a molecule with which a 26493 protein binds or interacts in nature (e.g., an oxidatively damaged form of guanine (e.g., 8-hydroxyguanine or 7,8-dihydro-8-oxoguanine)). In another embodiment, a 26493 activity is an indirect activity, such as a cellular signaling activity mediated by interaction of the 26493 protein with a second protein.

[0927] The features of the 26493 molecules of the present invention can provide similar biological activities as mutT/dGTPase family members. For example, the 26493 proteins of the present invention can have one or more of the following activities: (1) ability to catalyze the removal of an oxidatively damaged form of guanine (e.g., 8-hydroxyguanine or 7,8-dihydro-8-oxoguanine) from a DNA molecule and the nucleotide pool; (2) ability to repair DNA, e.g., correct error-prone translesion synthesis; (3) ability to degrade an oxidatively damaged form of guanine (e.g., 8-oxo-dGTP) to the monophosphate with the concomitant release of pyrophosphate; (4) ability to have NTPase (e.g., GTPase) activity; (5) ability to interact with (e.g., bind to) an NTPase, e.g., GTPase; (6) ability to modulate (e.g., to protect from) oxidative stress damage; (7) ability to modulate aging; (8) ability to modulate mutagenesis, carcinogenesis, and/or aberrant or deficient cellular proliferation or differentiation; or (9) ability to modulate the activity of cells expressing the 26493 polypeptides, e.g., neural cells. Thus, the 26493 molecules can act as novel diagnostic targets and therapeutic agents for controlling cancers, aging, disorders involving prostate or neural cells, cellular differentiation disorders, and the like.

[0928] The 26493 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of cellular proliferative and/or differentiative disorders, disorders associated with bone metabolism, immune disorders (e.g., inflammatory disorders), cardiovascular disorders, neural disorders, prostate disorders, among others.

[0929] Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast and liver origin.

[0930] As used herein, the terms “cancer”, “hyperproliferative” and “neoplastic” refer to cells having the capacity for autonomous growth. Examples of such cells include cells having an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative and neoplastic disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. “Pathologic hyperproliferative” cells occur in disease states characterized by malignant tumor growth. Examples of non-pathologic hyperproliferative cells include proliferation of cells associated with wound repair.

[0931] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as those affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[0932] The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

[0933] The term “sarcoma” is art recognized and refers to malignant tumors of mesenchymal derivation.

[0934] Additional examples of proliferative disorders include hematopoietic neoplastic disorders. As used herein, the term “hematopoietic neoplastic disorders” includes diseases involving hyperplastic/neoplastic cells of hematopoietic origin. A hematopoietic neoplastic disorder can arise from myeloid, lymphoid or erythroid lineages, or precursor cells thereof. Preferably, the diseases arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit Rev. in Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to, acute lymphoblastic leukemia (ALL) which includes B-lineage ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell leukemia (HLL) and Waldenstrom's macroglobulinemia (WM). Additional forms of malignant lymphomas include, but are not limited to, non-Hodgkin lymphoma and variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGF), Hodgkin's disease and Reed-Sternberg disease.

[0935] 26493 mRNA was found to be expressed in erythrocytes, neutrophils, megakaryocytes, lymph nodes, and normal tonsil. Thus, the 26493 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of immune disorders or disorders of the red blood cell. Examples of immune disorders or diseases include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy such as atopic allergy. 
Disorders involving red cells include, but are not limited to, anemias, such as hemolytic anemias, including hereditary spherocytosis, hemolytic disease due to erythrocyte enzyme defects: glucose-6-phosphate dehydrogenase deficiency, sickle cell disease, thalassemia syndromes, paroxysmal nocturnal hemoglobinuria, immunohemolytic anemia, and hemolytic anemia resulting from trauma to red cells; and anemias of diminished erythropoiesis, including megaloblastic anemias, such as anemias of vitamin B12 deficiency: pernicious anemia, and anemia of folate deficiency, iron deficiency anemia, anemia of chronic disease, aplastic anemia, pure red cell aplasia, and other forms of marrow failure.

[0936] Since 26493 mRNA was found to be expressed in various tissues of the heart, the 26493 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of disorders involving the heart. Examples of disorders involving the heart or “cardiovascular disorder” include, but are not limited to, a disease, disorder, or state involving the cardiovascular system, e.g., the heart, the blood vessels, and/or the blood. A cardiovascular disorder can be caused by an imbalance in arterial pressure, a malfunction of the heart, or an occlusion of a blood vessel, e.g., by a thrombus. Examples of such disorders include hypertension, atherosclerosis, coronary artery spasm, congestive heart failure, coronary artery disease, valvular disease, arrhythmias, and cardiomyopathies.

[0937] 26493 mRNA was found to be expressed in brain tissue, including the cortex and hypothalamus, with lower levels expressed in the spinal cord and nerves. Accordingly, the molecules of the invention may mediate disorders involving aberrant activities of these cells, for example neurodegenerative or brain disorders. Disorders involving the brain include, but are not limited to, disorders involving neurons, and disorders involving glia, such as astrocytes, oligodendrocytes, ependymal cells, and microglia; cerebral edema, raised intracranial pressure and herniation, and hydrocephalus; malformations and developmental diseases, such as neural tube defects, forebrain anomalies, posterior fossa anomalies, and syringomyelia and hydromyelia; perinatal brain injury; cerebrovascular diseases, such as those related to hypoxia, ischemia, and infarction, including hypotension, hypoperfusion, and low-flow states—global cerebral ischemia and focal cerebral ischemia—infarction from obstruction of local blood supply, intracranial hemorrhage, including intracerebral (intraparenchymal) hemorrhage, subarachnoid hemorrhage and ruptured berry aneurysms, and vascular malformations, hypertensive cerebrovascular disease, including lacunar infarcts, slit hemorrhages, and hypertensive encephalopathy; infections, such as acute meningitis, including acute pyogenic (bacterial) meningitis and acute aseptic (viral) meningitis, acute focal suppurative infections, including brain abscess, subdural empyema, and extradural abscess, chronic bacterial meningoencephalitis, including tuberculosis and mycobacterioses, neurosyphilis, and neuroborreliosis (Lyme disease), viral meningoencephalitis, including arthropod-borne (Arbo) viral encephalitis, Herpes simplex virus Type 1, Herpes simplex virus Type 2, Varicella-zoster virus (Herpes zoster), 
cytomegalovirus, poliomyelitis, rabies, and human immunodeficiency virus 1, including HIV-1 meningoencephalitis (subacute encephalitis), vacuolar myelopathy, AIDS-associated myopathy, peripheral neuropathy, and AIDS in children, progressive multifocal leukoencephalopathy, subacute sclerosing panencephalitis, fungal meningoencephalitis, other infectious diseases of the nervous system; transmissible spongiform encephalopathies (prion diseases); demyelinating diseases, including multiple sclerosis, multiple sclerosis variants, acute disseminated encephalomyelitis and acute necrotizing hemorrhagic encephalomyelitis, and other diseases with demyelination; degenerative diseases, such as degenerative diseases affecting the cerebral cortex, including Alzheimer disease and Pick disease, degenerative diseases of basal ganglia and brain stem, including Parkinsonism, idiopathic Parkinson disease (paralysis agitans), progressive supranuclear palsy, corticobasal degeneration, multiple system atrophy, including striatonigral degeneration, Shy-Drager syndrome, and olivopontocerebellar atrophy, and Huntington disease; spinocerebellar degenerations, including spinocerebellar ataxias, including Friedreich ataxia, and ataxia-telangiectasia, degenerative diseases affecting motor neurons, including amyotrophic lateral sclerosis (motor neuron disease), bulbospinal atrophy (Kennedy syndrome), and spinal muscular atrophy; inborn errors of metabolism, such as leukodystrophies, including Krabbe disease, metachromatic leukodystrophy, adrenoleukodystrophy, Pelizaeus-Merzbacher disease, and Canavan disease, mitochondrial encephalomyopathies, including Leigh disease and other mitochondrial encephalomyopathies; toxic and acquired metabolic diseases, including vitamin deficiencies such as thiamine (vitamin B₁) deficiency and vitamin B₁₂ deficiency, neurologic sequelae of metabolic disturbances, including hypoglycemia, hyperglycemia, and hepatic encephalopathy, toxic disorders, including carbon 
monoxide, methanol, ethanol, and radiation, including combined methotrexate and radiation-induced injury; tumors, such as gliomas, including astrocytoma, including fibrillary (diffuse) astrocytoma and glioblastoma multiforme, pilocytic astrocytoma, pleomorphic xanthoastrocytoma, and brain stem glioma, oligodendroglioma, and ependymoma and related paraventricular mass lesions, neuronal tumors, poorly differentiated neoplasms, including medulloblastoma, other parenchymal tumors, including primary brain lymphoma, germ cell tumors, and pineal parenchymal tumors, meningiomas, metastatic tumors, paraneoplastic syndromes, peripheral nerve sheath tumors, including schwannoma, neurofibroma, and malignant peripheral nerve sheath tumor (malignant schwannoma), and neurocutaneous syndromes (phakomatoses), including neurofibromatosis, including Type 1 neurofibromatosis (NF1) and Type 2 neurofibromatosis (NF2), tuberous sclerosis, and Von Hippel-Lindau disease.

[0938] 26493 mRNA was found to be expressed in kidney tissue. Accordingly, the molecules of the invention may also mediate disorders involving aberrant activities of kidney cells. Disorders involving the kidney include, but are not limited to, congenital anomalies including, but not limited to, cystic diseases of the kidney, that include but are not limited to, cystic renal dysplasia, autosomal dominant (adult) polycystic kidney disease, autosomal recessive (childhood) polycystic kidney disease, and cystic diseases of renal medulla, which include, but are not limited to, medullary sponge kidney, and nephronophthisis-uremic medullary cystic disease complex, acquired (dialysis-associated) cystic disease, such as simple cysts; glomerular diseases including pathologies of glomerular injury that include, but are not limited to, in situ immune complex deposition, that includes, but is not limited to, anti-GBM nephritis, Heymann nephritis, and antibodies against planted antigens, circulating immune complex nephritis, antibodies to glomerular cells, cell-mediated immunity in glomerulonephritis, activation of alternative complement pathway, epithelial cell injury, and pathologies involving mediators of glomerular injury including cellular and soluble mediators, acute glomerulonephritis, such as acute proliferative (poststreptococcal, postinfectious) glomerulonephritis, including but not limited to, poststreptococcal glomerulonephritis and nonstreptococcal acute glomerulonephritis, rapidly progressive (crescentic) glomerulonephritis, nephrotic syndrome, membranous glomerulonephritis (membranous nephropathy), minimal change disease (lipoid nephrosis), focal segmental glomerulosclerosis, membranoproliferative glomerulonephritis, IgA nephropathy (Berger disease), focal proliferative and necrotizing glomerulonephritis (focal glomerulonephritis), hereditary nephritis, including but not limited to, Alport syndrome and thin membrane disease (benign familial hematuria), chronic 
glomerulonephritis, glomerular lesions associated with systemic disease, including but not limited to, systemic lupus erythematosus, Henoch-Schönlein purpura, bacterial endocarditis, diabetic glomerulosclerosis, amyloidosis, fibrillary and immunotactoid glomerulonephritis, and other systemic disorders; diseases affecting tubules and interstitium, including acute tubular necrosis and tubulointerstitial nephritis, including but not limited to, pyelonephritis and urinary tract infection, acute pyelonephritis, chronic pyelonephritis and reflux nephropathy, and tubulointerstitial nephritis induced by drugs and toxins, including but not limited to, acute drug-induced interstitial nephritis, analgesic abuse nephropathy, nephropathy associated with nonsteroidal anti-inflammatory drugs, and other tubulointerstitial diseases including, but not limited to, urate nephropathy, hypercalcemia and nephrocalcinosis, and multiple myeloma; diseases of blood vessels including benign nephrosclerosis, malignant hypertension and accelerated nephrosclerosis, renal artery stenosis, and thrombotic microangiopathies including, but not limited to, classic (childhood) hemolytic-uremic syndrome, adult hemolytic-uremic syndrome/thrombotic thrombocytopenic purpura, idiopathic HUS/TTP, and other vascular disorders including, but not limited to, atherosclerotic ischemic renal disease, atheroembolic renal disease, sickle cell disease nephropathy, diffuse cortical necrosis, and renal infarcts; urinary tract obstruction (obstructive uropathy); urolithiasis (renal calculi, stones); and tumors of the kidney including, but not limited to, benign tumors, such as renal papillary adenoma, renal fibroma or hamartoma (renomedullary interstitial cell tumor), angiomyolipoma, and oncocytoma, and malignant tumors, including renal cell carcinoma (hypernephroma, adenocarcinoma of kidney) and urothelial carcinomas of the renal pelvis.

[0939] 26493 mRNA was found to be expressed in normal pancreas tissues, and thus the molecules of the invention may also mediate disorders involving aberrant activities of pancreas cells. Disorders involving the pancreas include those of the exocrine pancreas such as congenital anomalies, including but not limited to, ectopic pancreas; pancreatitis, including but not limited to, acute pancreatitis; cysts, including but not limited to, pseudocysts; tumors, including but not limited to, cystic tumors and carcinoma of the pancreas; and disorders of the endocrine pancreas such as, diabetes mellitus; islet cell tumors, including but not limited to, insulinomas, gastrinomas, and other rare islet cell tumors.

[0940] 26493 mRNA was also found to be expressed in normal and cancerous lung tissue. Accordingly, the molecules of the invention may also mediate disorders involving aberrant activities of lung cells. Examples of disorders of the lung include, but are not limited to, congenital anomalies; atelectasis; diseases of vascular origin, such as pulmonary congestion and edema, including hemodynamic pulmonary edema and edema caused by microvascular injury, adult respiratory distress syndrome (diffuse alveolar damage), pulmonary embolism, hemorrhage, and infarction, and pulmonary hypertension and vascular sclerosis; chronic obstructive pulmonary disease, such as emphysema, chronic bronchitis, bronchial asthma, and bronchiectasis; diffuse interstitial (infiltrative, restrictive) diseases, such as pneumoconioses, sarcoidosis, idiopathic pulmonary fibrosis, desquamative interstitial pneumonitis, hypersensitivity pneumonitis, pulmonary eosinophilia (pulmonary infiltration with eosinophilia), Bronchiolitis obliterans-organizing pneumonia, diffuse pulmonary hemorrhage syndromes, including Goodpasture syndrome, idiopathic pulmonary hemosiderosis and other hemorrhagic syndromes, pulmonary involvement in collagen vascular disorders, and pulmonary alveolar proteinosis; complications of therapies, such as drug-induced lung disease, radiation-induced lung disease, and lung transplantation; tumors, such as bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

[0941] 26493 mRNA was also found to be expressed in normal and cancerous prostate tissue. Accordingly, the molecules of the invention may also mediate disorders involving aberrant activities of prostate cells. Disorders involving the prostate include, but are not limited to, inflammations, benign enlargement, for example, nodular hyperplasia (benign prostatic hypertrophy or hyperplasia), and tumors such as carcinoma. As used herein, “a prostate disorder” refers to an abnormal condition occurring in the male pelvic region characterized by, e.g., male sexual dysfunction and/or urinary symptoms. This disorder may be manifested in the form of genitourinary inflammation (e.g., inflammation of smooth muscle cells) as in several common diseases of the prostate including prostatitis, benign prostatic hyperplasia and cancer, e.g., adenocarcinoma or carcinoma, of the prostate.

[0942] 26493 mRNA has also been found in normal and tumor breast tissue. Accordingly, the molecules of the invention may also mediate disorders involving aberrant activities of breast cells. Disorders of the breast include, but are not limited to, disorders of development; inflammations, including but not limited to, acute mastitis, periductal mastitis (recurrent subareolar abscess, squamous metaplasia of lactiferous ducts), mammary duct ectasia, fat necrosis, granulomatous mastitis, and pathologies associated with silicone breast implants; fibrocystic changes; proliferative breast disease including, but not limited to, epithelial hyperplasia, sclerosing adenosis, and small duct papillomas; tumors including, but not limited to, stromal tumors such as fibroadenoma, phyllodes tumor, and sarcomas, and epithelial tumors such as large duct papilloma; carcinoma of the breast including in situ (noninvasive) carcinoma that includes ductal carcinoma in situ (including Paget's disease) and lobular carcinoma in situ, and invasive (infiltrating) carcinoma including, but not limited to, invasive ductal carcinoma, no special type, invasive lobular carcinoma, medullary carcinoma, colloid (mucinous) carcinoma, tubular carcinoma, and invasive papillary carcinoma, and miscellaneous malignant neoplasms. Disorders in the male breast include, but are not limited to, gynecomastia and carcinoma.

[0943] As the 26493 mRNA is expressed in endothelial cells, the molecules of the invention can be used as diagnostic/therapeutic targets for disorders involving the blood vessels. Disorders involving blood vessels include, but are not limited to, responses of vascular cell walls to injury, such as endothelial dysfunction and endothelial activation and intimal thickening; vascular diseases including, but not limited to, congenital anomalies, such as arteriovenous fistula, atherosclerosis, and hypertensive vascular disease, such as hypertension; inflammatory disease (the vasculitides), such as giant cell (temporal) arteritis, Takayasu arteritis, polyarteritis nodosa (classic), Kawasaki syndrome (mucocutaneous lymph-node syndrome), microscopic polyangiitis (microscopic polyarteritis, hypersensitivity or leukocytoclastic angiitis), Wegener granulomatosis, thromboangiitis obliterans (Buerger disease), vasculitis associated with other disorders, and infectious arteritis; Raynaud disease; aneurysms and dissection, such as abdominal aortic aneurysms, syphilitic (luetic) aneurysms, and aortic dissection (dissecting hematoma); disorders of veins and lymphatics, such as varicose veins, thrombophlebitis and phlebothrombosis, obstruction of superior vena cava (superior vena cava syndrome), obstruction of inferior vena cava (inferior vena cava syndrome), and lymphangitis and lymphedema; tumors, including benign tumors and tumor-like conditions, such as hemangioma, lymphangioma, glomus tumor (glomangioma), vascular ectasias, and bacillary angiomatosis, and intermediate-grade (borderline low-grade malignant) tumors, such as Kaposi sarcoma and hemangioendothelioma, and malignant tumors, such as angiosarcoma and hemangiopericytoma; and pathology of therapeutic interventions in vascular disease, such as balloon angioplasty and related techniques and vascular replacement, such as coronary artery bypass graft surgery.

[0944] The 26493 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO: 17 are collectively referred to as “polypeptides or proteins of the invention” or “26493 polypeptides or proteins.” Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “26493 nucleic acids.” “26493 molecules” refers to 26493 nucleic acids, polypeptides, and antibodies.

[0945] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA), RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA. A DNA or RNA analog can be synthesized from nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[0946] The term “isolated nucleic acid molecule” or “purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules present in the natural source of the nucleic acid. For example, with regard to genomic DNA, the term “isolated” includes nucleic acid molecules which are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[0947] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and nonaqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified.
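For illustration only, the four hybridization conditions recited above can be recorded as a small lookup table. The following Python sketch is not part of the described methods; the names `Stringency`, `CONDITIONS`, and `select_condition` are hypothetical, and the only values used are those stated in the preceding paragraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stringency:
    hybridization: str   # hybridization buffer and temperature, as recited
    wash: str            # wash buffer, as recited
    wash_temp_c: int     # wash temperature in degrees C (minimum for "low")


# Values transcribed from conditions 1) through 4) of the paragraph above.
CONDITIONS = {
    "low": Stringency("6x SSC at about 45 C", "0.2x SSC, 0.1% SDS", 50),
    "medium": Stringency("6x SSC at about 45 C", "0.2x SSC, 0.1% SDS", 60),
    "high": Stringency("6x SSC at about 45 C", "0.2x SSC, 0.1% SDS", 65),
    "very_high": Stringency("0.5 M sodium phosphate, 7% SDS at 65 C",
                            "0.2x SSC, 1% SDS", 65),
}


def select_condition(name=None):
    """Return the named condition; very high stringency is the stated default."""
    return CONDITIONS[name or "very_high"]


print(select_condition())        # default: very high stringency
print(select_condition("low"))
```

Defaulting to the very high stringency entry mirrors the statement that condition (4) is the one to be used unless otherwise specified.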

[0948] Preferably, an isolated nucleic acid molecule of the invention that hybridizes under a stringency condition described herein to the sequence of SEQ ID NO: 16 or SEQ ID NO: 18, corresponds to a naturally-occurring nucleic acid molecule.

[0949] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature. For example a naturally occurring nucleic acid molecule can encode a natural protein.

[0950] As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules which include at least an open reading frame encoding a 26493 protein. The gene can optionally further include non-coding sequences, e.g., regulatory sequences and introns. Preferably, a gene encodes a mammalian 26493 protein or derivative thereof.

[0951] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. “Substantially free” means that a preparation of 26493 protein is at least 10% pure. In a preferred embodiment, the preparation of 26493 protein has less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-26493 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-26493 chemicals. When the 26493 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[0952] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 26493 without abolishing or substantially altering a 26493 activity. Preferably the alteration does not substantially alter the 26493 activity, e.g., the activity is at least 20%, 40%, 60%, 70% or 80% of wild-type. An “essential” amino acid residue is a residue that, when altered from the wild-type sequence of 26493, results in abolishing a 26493 activity such that less than 20% of the wild-type activity is present. For example, conserved amino acid residues in 26493 are predicted to be particularly unamenable to alteration.

[0953] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a 26493 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a 26493 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 26493 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO: 16 or SEQ ID NO: 18, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
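By way of a non-limiting sketch, the side chain families listed above can be encoded directly to test whether a given substitution is conservative in the sense of this paragraph. The Python names below (`SIDE_CHAIN_FAMILIES`, `shared_families`, `is_conservative`) are hypothetical, one-letter amino acid codes are assumed, and family membership is taken only from the list above.

```python
# Side chain families as listed in the paragraph above (one-letter codes).
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),            # lysine, arginine, histidine
    "acidic": set("DE"),            # aspartic acid, glutamic acid
    "uncharged_polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta_branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(original, replacement):
    """Return the sorted names of families that contain both residues."""
    return sorted(
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if original in members and replacement in members
    )


def is_conservative(original, replacement):
    """True if the two residues differ but share at least one family."""
    return original != replacement and bool(shared_families(original, replacement))


print(is_conservative("K", "R"))   # lysine to arginine: both basic
print(is_conservative("D", "K"))   # aspartic acid to lysine: no shared family
print(shared_families("T", "V"))   # threonine and valine share one family
```

Because some residues belong to more than one family (for example, histidine is listed as both basic and aromatic), the sketch treats a substitution as conservative if any single family contains both residues.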

[0954] As used herein, a “biologically active portion” of a 26493 protein includes a fragment of a 26493 protein which participates in an interaction, e.g., an intramolecular or an inter-molecular interaction. An inter-molecular interaction can be a specific binding interaction or an enzymatic interaction (e.g., the interaction can be transient and a covalent bond is formed or broken). An inter-molecular interaction can be between a 26493 molecule and a non-26493 molecule or between a first 26493 molecule and a second 26493 molecule (e.g., a dimerization interaction). Biologically active portions of a 26493 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 26493 protein, e.g., the amino acid sequence shown in SEQ ID NO: 17, which include fewer amino acids than the full-length 26493 proteins, and exhibit at least one activity of a 26493 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 26493 protein, e.g., the ability to catalyze the degradation of an oxidatively damaged form of guanine (e.g., 8-hydroxyguanine or 7,8-dihydro-8-oxoguanine) to the monophosphate with the concomitant release of pyrophosphate. A biologically active portion of a 26493 protein can be a polypeptide which is, for example, 10, 25, 50, 100, 200 or more amino acids in length. Biologically active portions of a 26493 protein can be used as targets for developing agents which modulate a 26493 mediated activity, e.g., an oxoguanine dGTPase activity.

[0955] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[0956] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, 60%, and even more preferably at least 70%, 80%, 90%, 100% of the length of the reference sequence. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”).

[0957] The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
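As a minimal sketch of the comparison described above, percent identity can be computed from two sequences that have already been aligned. The function name `percent_identity` is hypothetical, and the choice of denominator is an assumption: here identical positions are divided by the total number of alignment columns, including gap columns, which is only one of several conventions in use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same residue in both rows.

    Assumes the inputs are already aligned (equal length, gaps marked with
    `gap`). Gap columns count toward the length but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


# Six columns, four of which are identical residues.
print(round(percent_identity("ACGT-A", "ACCTTA"), 2))
```

Under this convention, each introduced gap lowers the reported identity, which reflects the statement that the number and length of gaps are taken into account.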

[0958] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch ((1970) J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used unless otherwise specified) is a BLOSUM62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
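The global alignment approach of Needleman and Wunsch can be sketched as follows. This is a simplified illustration and not the GAP program: it assumes a flat match/mismatch score in place of a substitution matrix and a single linear gap penalty in place of separate gap and length weights. The function name `needleman_wunsch` and its default scores are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align a and b; return (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First row and column: aligning a prefix against nothing costs gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def sub(i, j):
        # Score for pairing a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic programming table.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # pair the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))
```

Replacing the flat scores with a substitution matrix lookup and an affine gap model would bring the sketch closer to the parameters recited above, at the cost of a longer recurrence.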

[0959] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Meyers and W. Miller ((1989) CABIOS, 4:11-17) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[0960] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 26493 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 26493 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25:3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[0961] Particular 26493 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO: 17. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO: 17 are termed substantially identical.

[0962] In the context of a nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO: 16 or 18 are termed substantially identical.

[0963] “Misexpression or aberrant expression”, as used herein, refers to a non-wildtype pattern of gene expression at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over- or under-expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of altered, e.g., increased or decreased, expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, translated amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[0964] “Subject,” as used herein, refers to human and non-human animals. The term “non-human animals” of the invention includes all vertebrates, e.g., mammals, such as non-human primates (particularly higher primates), sheep, dog, rodent (e.g., mouse or rat), guinea pig, goat, pig, cat, rabbits, cow, and non-mammals, such as chickens, amphibians, reptiles, etc. In a preferred embodiment, the subject is a human. In another embodiment, the subject is an experimental animal or animal suitable as a disease model.

[0965] A “purified preparation of cells”, as used herein, refers to an in vitro preparation of cells. In the case of cells from multicellular organisms (e.g., plants and animals), a purified preparation of cells is a subset of cells obtained from the organism, not the entire intact organism. In the case of unicellular microorganisms (e.g., cultured cells and microbial cells), it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[0966] Various aspects of the invention are described in further detail below.

[0967] Isolated Nucleic Acid Molecules of 26493

[0968] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes a 26493 polypeptide described herein, e.g., a full-length 26493 protein or a fragment thereof, e.g., a biologically active portion of 26493 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 26493 mRNA, and fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[0969] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO: 16, or a portion of this nucleotide sequence. In one embodiment, the nucleic acid molecule includes sequences encoding the human 26493 protein (i.e., “the coding region” of SEQ ID NO: 16, as shown in SEQ ID NO: 18), as well as 5′ untranslated sequences. Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO: 16 (e.g., SEQ ID NO: 18) and, e.g., no flanking sequences which normally accompany the subject sequence. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to a fragment of the protein from about amino acid 1 to 1298, or 1838 to 1902 of SEQ ID NO: 17.

[0970] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule which is a complement of the nucleotide sequence shown in SEQ ID NO: 16 or SEQ ID NO: 18, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO: 16 or SEQ ID NO: 18, such that it can hybridize (e.g., under a stringency condition described herein) to the nucleotide sequence shown in SEQ ID NO:16 or 18, thereby forming a stable duplex.
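The notion of a complement can be illustrated with a short sketch that derives the reverse complement of a DNA strand and checks whether two strands are fully complementary. The function names are hypothetical, only the four unambiguous DNA bases are handled, and exact complementarity is used as a simplification: sequences that are merely “sufficiently complementary” to hybridize under the conditions described herein would not be recognized by this check.

```python
# Watson-Crick pairing for the four unambiguous DNA bases (either case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_fully_complementary(strand_a, strand_b):
    """True if strand_b is the exact reverse complement of strand_a."""
    return reverse_complement(strand_a).upper() == strand_b.upper()


print(reverse_complement("ATGC"))
print(is_fully_complementary("ATGC", "GCAT"))
```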

[0971] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence which is at least about: 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO: 16 or SEQ ID NO: 18, or a portion, preferably of the same length, of any of these nucleotide sequences.

[0972] 26493 Nucleic Acid Fragments

[0973] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO: 16 or 18. For example, such a nucleic acid molecule can include a fragment which can be used as a probe or primer or a fragment encoding a portion of a 26493 protein, e.g., an immunogenic or biologically active portion of a 26493 protein. A fragment can comprise those nucleotides of SEQ ID NO:16, which encode a mutT domain of human 26493. The nucleotide sequence determined from the cloning of the 26493 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 26493 family members, or fragments thereof, as well as 26493 homologues, or fragments thereof, from other species.

[0974] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment which includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 50, 100, 150, 200, 250, 300, 350, 400, 404, or 450 amino acids in length. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention.

[0975] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domain, region, or functional site described herein. Thus, for example, a 26493 nucleic acid fragment can include a sequence corresponding to a mutT domain at locations in the translated 26493 polypeptide described herein.

[0976] 26493 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under a stringency condition described herein to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO: 16 or SEQ ID NO: 18, or of a naturally occurring allelic variant or mutant of SEQ ID NO: 16 or SEQ ID NO:18.

[0977] In a preferred embodiment the nucleic acid is a probe which is at least 5 or 10, and less than 200, more preferably less than 100, or less than 50, base pairs in length. It should be identical to, or differ by 1, or by fewer than 5 or 10, bases from a sequence disclosed herein. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.
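
The difference count described above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned for maximum homology by some external method, with “looped” out positions written as '-' characters; the function names and the example sequences are hypothetical and are not taken from the sequence listing. Percent identity is computed here over alignment columns, which is only one of several conventions in use.

```python
def count_differences(aligned_probe: str, aligned_reference: str) -> int:
    """Count differing columns between two pre-aligned sequences.

    A gap character ('-') marks a "looped" out position arising from an
    insertion or deletion. Each gap column and each mismatch column counts
    as one difference.
    """
    if len(aligned_probe) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    pairs = zip(aligned_probe.upper(), aligned_reference.upper())
    return sum(1 for a, b in pairs if a != b)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns that are identical (assumed convention)."""
    columns = len(aligned_a)
    if columns == 0:
        raise ValueError("alignment is empty")
    identical = columns - count_differences(aligned_a, aligned_b)
    return 100.0 * identical / columns


if __name__ == "__main__":
    # Hypothetical 9-column alignment: one mismatch and one looped-out base.
    probe = "ACGTAC-GT"
    reference = "ACCTACAGT"
    print(count_differences(probe, reference))
    print(round(percent_identity(probe, reference), 1))
```

A probe would satisfy the "identical, or differ by 1" criterion when count_differences returns 0 or 1 against the disclosed sequence.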

[0978] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid which encodes, e.g., a mutT domain from about amino acid 122 to 251 of SEQ ID NO: 17.

[0979] In another embodiment a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 26493 sequence, e.g., a domain, region, site or other sequence described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical to, or differ by one base from, a sequence disclosed herein or a naturally occurring variant. For example, primers suitable for amplifying all or a portion of any of the following regions are provided: a mutT domain from about amino acid 122 to 251 of SEQ ID NO: 17.
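
As a worked illustration of locating the region that such primers would flank, the following Python sketch converts an amino acid range into the corresponding nucleotide range of a coding sequence. It assumes the coding sequence is numbered from the first base of the start codon (position 1) and contains no introns; the function name is hypothetical, and the domain boundaries are stated in the text only as approximate ("about").

```python
from typing import Tuple


def codon_span(first_residue: int, last_residue: int) -> Tuple[int, int]:
    """Map a 1-based, inclusive amino acid range to the 1-based, inclusive
    nucleotide range of the coding sequence that encodes it.

    Assumes nucleotide 1 is the first base of the start codon.
    """
    if first_residue < 1 or last_residue < first_residue:
        raise ValueError("residue range must be 1-based and non-empty")
    start = 3 * (first_residue - 1) + 1  # first base of the first codon
    end = 3 * last_residue               # third base of the last codon
    return start, end


if __name__ == "__main__":
    # mutT domain, about amino acids 122 to 251 (130 residues).
    start, end = codon_span(122, 251)
    print(start, end, end - start + 1)  # 364 753 390
```

Under these assumptions the domain corresponds to 390 nucleotides of coding sequence, so a forward primer would anneal at or upstream of position 364 and a reverse primer at or downstream of position 753.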

[0980] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[0981] A nucleic acid fragment encoding a “biologically active portion of a 26493 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:16 or 18, which encodes a polypeptide having a 26493 biological activity (e.g., the biological activities of the 26493 proteins are described herein), expressing the encoded portion of the 26493 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 26493 protein. For example, a nucleic acid fragment encoding a biologically active portion of 26493 includes a mutT domain, e.g., amino acid residues about 122 to 251 of SEQ ID NO: 17. A nucleic acid fragment encoding a biologically active portion of a 26493 polypeptide may comprise a nucleotide sequence which is 300 or more nucleotides in length.

[0982] In preferred embodiments, a nucleic acid includes a nucleotide sequence which is about 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300 or more nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO:16, or SEQ ID NO:18.

[0983] 26493 Nucleic Acid Variants

[0984] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:16 or SEQ ID NO:18. Such differences can be due to degeneracy of the genetic code and result in a nucleic acid which encodes the same 26493 proteins as those encoded by the nucleotide sequence disclosed herein. In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence which differs by at least 1, but less than 5, 10, 20, 50, or 100 amino acid residues from that shown in SEQ ID NO:17. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[0985] Nucleic acids of the invention can be chosen for having codons which are preferred, or non-preferred, for a particular expression system. E.g., the nucleic acid can be one in which at least one codon, and preferably at least 10% or 20% of the codons, has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells.
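
The following Python sketch illustrates, under stated assumptions, how synonymous codons might be swapped and how the fraction of altered codons might be computed. The standard genetic code is used; the codon preference table passed to the function is a hypothetical placeholder and is not a measured codon usage table for any expression system. The function names and the short example sequence are likewise illustrative only.

```python
from itertools import product
from typing import Dict, Tuple

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def split_codons(cds: str):
    """Split a coding sequence into codons, checking its length."""
    cds = cds.upper()
    if not cds or len(cds) % 3:
        raise ValueError("coding sequence length must be a non-zero multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds: str) -> str:
    """Translate a coding sequence; '*' marks a stop codon."""
    return "".join(CODON_TABLE[c] for c in split_codons(cds))


def recode(cds: str, preferred: Dict[str, str]) -> Tuple[str, float]:
    """Replace codons with the preferred synonymous codon where one is given.

    Returns the recoded sequence and the fraction of codons altered.
    """
    for aa, codon in preferred.items():
        if CODON_TABLE.get(codon.upper()) != aa:
            raise ValueError(f"{codon} does not encode {aa}")
    codons = split_codons(cds)
    recoded = [preferred.get(CODON_TABLE[c], c).upper() for c in codons]
    altered = sum(old != new for old, new in zip(codons, recoded))
    return "".join(recoded), altered / len(codons)


if __name__ == "__main__":
    toy_preferences = {"L": "CTG", "K": "AAA"}  # hypothetical preferences
    new_cds, fraction = recode("ATGTTAAAGTAA", toy_preferences)
    print(new_cds, fraction)
    # The encoded protein is unchanged by synonymous recoding.
    print(translate(new_cds) == translate("ATGTTAAAGTAA"))
```

Because only synonymous codons are exchanged, the recoded sequence encodes the same polypeptide, and the returned fraction can be compared against thresholds such as 10% or 20%.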

[0986] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism) or can be non naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared in the encoded product).

[0987] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO: 16 or 18, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 1%, 5%, 10% or 20% of the nucleotides in the subject nucleic acid. If necessary for this analysis the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[0988] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the amino acid sequence shown in SEQ ID NO: 17 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under a stringency condition described herein, to the nucleotide sequence shown in SEQ ID NO:17 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 26493 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 26493 gene.

[0989] Preferred variants include those that are correlated with modulating (stimulating and/or enhancing or inhibiting) cellular proliferation, differentiation, or tumorigenesis; modulating aging; or modulating neural activity.

[0990] Allelic variants of 26493, e.g., human 26493, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 26493 protein within a population that maintain the ability to bind oxidatively damaged guanine substrates, and to catalyze the removal of an oxidatively damaged form of guanine (e.g., 8-hydroxyguanine or 7,8-dihydro-8-oxoguanine) from DNA and/or the nucleotide pool. Functional allelic variants will typically contain only conservative substitution of one or more amino acids of SEQ ID NO:17, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally-occurring amino acid sequence variants of the 26493, e.g., human 26493, protein within a population that do not have the ability to bind oxidatively damaged guanine substrates, and to catalyze the removal of an oxidatively damaged form of guanine (e.g., 8-hydroxyguanine or 7,8-dihydro-8-oxoguanine) from DNA and/or the nucleotide pool. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO: 17, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[0991] Moreover, nucleic acid molecules encoding other 26493 family members and, thus, which have a nucleotide sequence which differs from the 26493 sequences of SEQ ID NO:16 or SEQ ID NO: 18 are intended to be within the scope of the invention.

[0992] Antisense Nucleic Acid Molecules, Ribozymes and Modified 26493 Nucleic Acid Molecules

[0993] In another aspect, the invention features an isolated nucleic acid molecule which is antisense to 26493. An “antisense” nucleic acid can include a nucleotide sequence which is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 26493 coding strand, or to only a portion thereof (e.g., the coding region of human 26493 corresponding to SEQ ID NO: 18). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 26493 (e.g., the 5′ and 3′ untranslated regions).

[0994] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 26493 mRNA, but more preferably is an oligonucleotide which is antisense to only a portion of the coding or noncoding region of 26493 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 26493 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence of interest. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
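
A minimal Python sketch of deriving an antisense oligonucleotide for the region surrounding the translation start site is given below. It assumes the sense strand is supplied in the DNA alphabet, that the position of the start codon is already known, and that the window runs from 10 bases upstream of the start codon through the first 10 bases of the coding region; the function names and the example sequence are hypothetical and do not come from SEQ ID NO:16.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def antisense_to_start_site(sense: str, start_index: int,
                            upstream: int = 10, downstream: int = 10) -> str:
    """Return an oligonucleotide antisense to the window around the start codon.

    start_index is the 0-based index of the A of the ATG start codon in the
    sense strand. The window covers `upstream` bases before the ATG and the
    first `downstream` bases of the coding region.
    """
    sense = sense.upper()
    if sense[start_index:start_index + 3] != "ATG":
        raise ValueError("start_index does not point at an ATG codon")
    lo = start_index - upstream
    hi = start_index + downstream
    if lo < 0 or hi > len(sense):
        raise ValueError("window extends beyond the supplied sequence")
    return reverse_complement(sense[lo:hi])


if __name__ == "__main__":
    # Hypothetical sense strand: 10 bases of 5' UTR, then the coding region.
    sense = "GGCCTTAAGG" + "ATGGCTAGCA" + "TTTT"
    oligo = antisense_to_start_site(sense, start_index=10)
    print(oligo, len(oligo))
```

The default window yields a 20-nucleotide oligonucleotide, which falls within the lengths recited above; other lengths follow from changing the two window parameters.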

[0995] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[0996] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a 26493 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies which bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

[0997] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[0998] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for a 26493-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of a 26493 cDNA disclosed herein (i.e., SEQ ID NO:16 or SEQ ID NO:18), and a sequence having known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haseloff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a 26493-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 26493 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science 261:1411-1418.

[0999] 26493 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 26493 gene (e.g., the 26493 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 26493 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6:569-84; Helene, C. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) BioEssays 14:807-15. The potential sequences that can be targeted for triple helix formation can be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.

[1000] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[1001] A 26493 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For non-limiting examples of synthetic oligonucleotides with modifications see Toulmé (2001) Nature Biotech. 19:17 and Faria et al. (2001) Nature Biotech. 19:40-44. Such phosphoramidite oligonucleotides can be effective antisense agents.

[1002] For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4: 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra and Perry-O'Keefe et al. Proc. Natl. Acad. Sci. 93: 14670-675.

[1003] PNAs of 26493 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 26493 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene, (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes, (e.g., S1 nucleases (Hyrup B. et al. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[1004] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[1005] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region which is complementary to a 26493 nucleic acid of the invention, and two complementary regions, one having a fluorophore and one a quencher, such that the molecular beacon is useful for quantitating the presence of the 26493 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.
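
The arrangement of a target-complementary region between two mutually complementary regions can be sketched in Python as follows. This is an illustration under assumptions, not a design procedure from the cited patents: the stem sequence, the function names, and the example target are hypothetical, and a practical design would additionally evaluate melting temperatures and unintended secondary structure. The sketch places the two complementary arms at the 5′ and 3′ ends, where a fluorophore and a quencher would be attached.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_beacon(target_site: str, stem: str = "CGCTC") -> str:
    """Assemble a hairpin probe: 5' arm, target-complementary loop, 3' arm.

    The loop is the reverse complement of the target site, so it can
    hybridize to the target. The 3' arm is the reverse complement of the
    5' arm, so the two arms can pair and bring the ends together when no
    target is bound. The default stem is an arbitrary placeholder.
    """
    loop = reverse_complement(target_site)
    five_prime_arm = stem.upper()
    three_prime_arm = reverse_complement(stem)
    return five_prime_arm + loop + three_prime_arm


if __name__ == "__main__":
    # Hypothetical 15-nucleotide target site.
    beacon = design_beacon("ATGGCTAGCAAGGTC")
    print(beacon, len(beacon))
```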

[1006] Isolated 26493 Polypeptides

[1007] In another aspect, the invention features an isolated 26493 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-26493 antibodies. 26493 protein can be isolated from cells or tissue sources using standard protein purification techniques. 26493 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[1008] Polypeptides of the invention include those which arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when expressed in a native cell.

[1009] In a preferred embodiment, a 26493 polypeptide has one or more of the following characteristics:

[1010] (i) it catalyzes the removal of an oxidatively damaged form of guanine (e.g., 8-hydroxyguanine or 7,8-dihydro-8-oxoguanine) from a DNA molecule and the nucleotide pool;

[1011] (ii) it repairs DNA, e.g., corrects error-prone translesion synthesis;

[1012] (iii) it degrades an oxidatively damaged form of guanine (e.g., 8-oxo-dGTP) to the monophosphate with the concomitant release of pyrophosphate;

[1013] (iv) it has GTPase activity, e.g., an oxoguanine dGTPase activity;

[1014] (v) it has an amino acid composition of a 26493 polypeptide, e.g., a polypeptide of SEQ ID NO: 17;

[1015] (vi) it has an overall sequence similarity of at least 60%, preferably at least 70, more preferably at least 80, 90, or 95%, with a polypeptide of SEQ ID NO: 17;

[1016] (vii) it can be found in a human tissue, e.g., neural tissue;

[1017] (viii) it can be found in a subcellular organelle, e.g., mitochondria;

[1018] (ix) it has a mutT domain with a sequence similarity which is preferably about 70%, 80%, 90% or 95%, with amino acid residues about 122 to about 251 of SEQ ID NO: 17;

[1019] (x) it has at least three, preferably at least 4 glutamate residues found in the amino acid sequence of the protein of SEQ ID NO: 17; or

[1020] (xi) it has at least 5, preferably at least 8, and most preferably at least 9 of the 10 cysteines found in the amino acid sequence of the native protein.

[1021] In a preferred embodiment the 26493 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:17. In one embodiment it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another, it differs from the corresponding sequence in SEQ ID NO: 17 by at least one residue, but fewer than 20%, 15%, 10% or 5% of its residues differ from the corresponding sequence in SEQ ID NO:17. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non-essential residue or a conservative substitution. In a preferred embodiment the differences are not in the mutT domain.

[1022] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 26493 proteins differ in amino acid sequence from SEQ ID NO: 17, yet retain biological activity.

[1023] In one embodiment, the protein includes an amino acid sequence at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more homologous to SEQ ID NO: 17.

[1024] A 26493 protein or fragment is provided which varies from the sequence of SEQ ID NO:17 in regions defined by amino acids about 1 to 216, and from about amino acid 444 to about 455 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment but which does not differ from SEQ ID NO: 17 in regions defined by amino acids about 217 to about 443. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.

[1025] In one embodiment, a biologically active portion of a 26493 protein includes a mutT domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 26493 protein.

[1026] In a preferred embodiment, the 26493 protein has an amino acid sequence shown in SEQ ID NO:17. In other embodiments, the 26493 protein is substantially identical to SEQ ID NO: 17. In yet another embodiment, the 26493 protein is substantially identical to SEQ ID NO: 17 and retains the functional activity of the protein of SEQ ID NO: 17, as described in detail in the subsections above.

[1027] 26493 Chimeric or Fusion Proteins

[1028] In another aspect, the invention provides 26493 chimeric or fusion proteins. As used herein, a 26493 “chimeric protein” or “fusion protein” includes a 26493 polypeptide linked to a non-26493 polypeptide. A “non-26493 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the 26493 protein, e.g., a protein which is different from the 26493 protein and which is derived from the same or a different organism. The 26493 polypeptide of the fusion protein can correspond to all or a portion, e.g., a fragment described herein, of a 26493 amino acid sequence. In a preferred embodiment, a 26493 fusion protein includes at least one (or two) biologically active portion of a 26493 protein. The non-26493 polypeptide can be fused to the N-terminus or C-terminus of the 26493 polypeptide.

[1029] The fusion protein can include a moiety which has a high affinity for a ligand. For example, the fusion protein can be a GST-26493 fusion protein in which the 26493 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 26493. Alternatively, the fusion protein can be a 26493 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 26493 can be increased through use of a heterologous signal sequence.

[1030] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[1031] The 26493 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 26493 fusion proteins can be used to affect the bioavailability of a 26493 substrate. 26493 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example, (i) aberrant modification or mutation of a gene encoding a 26493 protein; (ii) mis-regulation of the 26493 gene; and (iii) aberrant post-translational modification of a 26493 protein.

[1032] Moreover, the 26493-fusion proteins of the invention can be used as immunogens to produce anti-26493 antibodies in a subject, to purify 26493 ligands and in screening assays to identify molecules which inhibit the interaction of 26493 with a 26493 substrate.

[1033] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 26493-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 26493 protein.

Variants of 26493 Proteins

[1034] In another aspect, the invention also features a variant of a 26493 polypeptide, e.g., which functions as an agonist (mimetics) or as an antagonist. Variants of the 26493 proteins can be generated by mutagenesis, e.g., discrete point mutation, the insertion or deletion of sequences or the truncation of a 26493 protein. An agonist of the 26493 proteins can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of a 26493 protein. An antagonist of a 26493 protein can inhibit one or more of the activities of the naturally occurring form of the 26493 protein by, for example, competitively modulating a 26493-mediated activity of a 26493 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 26493 protein.

[1035] Variants of a 26493 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a 26493 protein for agonist or antagonist activity.

[1036] Libraries of fragments, e.g., N-terminal, C-terminal, or internal fragments, of a 26493 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of a 26493 protein. Variants in which a cysteine residue is added or deleted or in which a residue which is glycosylated is added or deleted are particularly preferred.

[1037] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property are known in the art. Such methods are adaptable for rapid screening of the gene libraries generated by combinatorial mutagenesis of 26493 proteins. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify 26493 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6:327-331).

[1038] Cell-based assays can be exploited to analyze a variegated 26493 library. For example, a library of expression vectors can be transfected into a cell line, e.g., a cell line which ordinarily responds to 26493 in a substrate-dependent manner. The transfected cells are then contacted with 26493 and the effect of the expression of the mutant on signaling by the 26493 substrate can be detected. Plasmid DNA can then be recovered from the cells which score for inhibition, or alternatively, potentiation of signaling by the 26493 substrate, and the individual clones further characterized.

[1039] In another aspect, the invention features a method of making a 26493 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 26493 polypeptide. The method includes: altering the sequence of a 26493 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[1040] In another aspect, the invention features a method of making a fragment or analog of a 26493 polypeptide having a biological activity of a naturally occurring 26493 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of a 26493 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[1041] Anti-26493 Antibodies

[1042] In another aspect, the invention provides an anti-26493 antibody, or a fragment thereof (e.g., an antigen-binding fragment thereof). The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. As used herein, the term “antibody” refers to a protein comprising at least one, and preferably two, heavy (H) chain variable regions (abbreviated herein as VH), and at least one and preferably two light (L) chain variable regions (abbreviated herein as VL). The VH and VL regions can be further subdivided into regions of hypervariability, termed “complementarity determining regions” (“CDR”), interspersed with regions that are more conserved, termed “framework regions” (FR). The extent of the framework region and CDR's has been precisely defined (see, Kabat, E. A., et al. (1991) Sequences of Proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, and Chothia, C. et al. (1987) J. Mol. Biol. 196:901-917, which are incorporated herein by reference). Each VH and VL is composed of three CDR's and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.

[1043] The anti-26493 antibody can further include a heavy and light chain constant region, to thereby form a heavy and light immunoglobulin chain, respectively. In one embodiment, the antibody is a tetramer of two heavy immunoglobulin chains and two light immunoglobulin chains, wherein the heavy and light immunoglobulin chains are inter-connected by, e.g., disulfide bonds. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. The light chain constant region is comprised of one domain, CL. The variable region of the heavy and light chains contains a binding domain that interacts with an antigen. The constant regions of the antibodies typically mediate the binding of the antibody to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system.

[1044] As used herein, the term “immunoglobulin” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes. The recognized human immunoglobulin genes include the kappa, lambda, alpha (IgA1 and IgA2), gamma (IgG1, IgG2, IgG3, IgG4), delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Full-length immunoglobulin “light chains” (about 25 kDa or 214 amino acids) are encoded by a variable region gene at the NH2-terminus (about 110 amino acids) and a kappa or lambda constant region gene at the COOH-terminus. Full-length immunoglobulin “heavy chains” (about 50 kDa or 446 amino acids) are similarly encoded by a variable region gene (about 116 amino acids) and one of the other aforementioned constant region genes, e.g., gamma (encoding about 330 amino acids).

[1045] The term “antigen-binding fragment” of an antibody (or simply “antibody portion,” or “fragment”), as used herein, refers to one or more fragments of a full-length antibody that retain the ability to specifically bind to the antigen, e.g., 26493 polypeptide or fragment thereof. Examples of antigen-binding fragments of the anti-26493 antibody include, but are not limited to: (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab′)₂ fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 341:544-546), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) Science 242:423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879-5883). Such single chain antibodies are also encompassed within the term “antigen-binding fragment” of an antibody. These antibody fragments are obtained using conventional techniques known to those with skill in the art, and the fragments are screened for utility in the same manner as are intact antibodies.

[1046] The anti-26493 antibody can be a polyclonal or a monoclonal antibody. In other embodiments, the antibody can be recombinantly produced, e.g., produced by phage display or by combinatorial methods.

[1047] Phage display and combinatorial methods for generating anti-26493 antibodies are known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. International Publication No. WO 92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. International Publication No. WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and Barbas et al. (1991) PNAS 88:7978-7982, the contents of all of which are incorporated by reference herein).

[1048] In one embodiment, the anti-26493 antibody is a fully human antibody (e.g., an antibody made in a mouse which has been genetically engineered to produce an antibody from a human immunoglobulin sequence), or a non-human antibody, e.g., a rodent (mouse or rat), goat, primate (e.g., monkey), or camel antibody. Preferably, the non-human antibody is a rodent (mouse or rat) antibody. Methods of producing rodent antibodies are known in the art.

[1049] Human monoclonal antibodies can be generated using transgenic mice carrying the human immunoglobulin genes rather than the mouse system. Splenocytes from these transgenic mice immunized with the antigen of interest are used to produce hybridomas that secrete human mAbs with specific affinities for epitopes from a human protein (see, e.g., Wood et al. International Application WO 91/00906, Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International Application WO 92/03918; Kay et al. International Application 92/03917; Lonberg, N. et al. 1994 Nature 368:856-859; Green, L. L. et al. 1994 Nature Genet. 7:13-21; Morrison, S. L. et al. 1994 Proc. Natl. Acad. Sci. USA 81:6851-6855; Bruggeman et al. 1993 Year Immunol 7:33-40; Tuaillon et al. 1993 PNAS 90:3720-3724; Bruggeman et al. 1991 Eur J Immunol 21:1323-1326).

[1050] An anti-26493 antibody can be one in which the variable region, or a portion thereof, e.g., the CDR's, are generated in a non-human organism, e.g., a rat or mouse. Chimeric, CDR-grafted, and humanized antibodies are within the invention. Antibodies generated in a non-human organism, e.g., a rat or mouse, and then modified, e.g., in the variable framework or constant region, to decrease antigenicity in a human are within the invention.

[1051] Chimeric antibodies can be produced by recombinant DNA techniques known in the art. For example, a gene encoding the Fc constant region of a murine (or other species) monoclonal antibody molecule is digested with restriction enzymes to remove the region encoding the murine Fc, and the equivalent portion of a gene encoding a human Fc constant region is substituted (see Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., International Application WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al., European Patent Application 125,023; Better et al. (1988 Science 240:1041-1043); Liu et al. (1987) PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al. (1987) PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al. (1985) Nature 314:446-449; and Shaw et al., 1988, J. Natl Cancer Inst. 80:1553-1559).

[1052] A humanized or CDR-grafted antibody will have at least one or two, but generally all three, recipient CDR's (of heavy and/or light immunoglobulin chains) replaced with a donor CDR. A CDR may be replaced with at least a portion of a non-human CDR, or only some of the CDR's may be replaced with non-human CDR's. It is only necessary to replace the number of CDR's required for binding of the humanized antibody to 26493 or a fragment thereof. Preferably, the donor will be a rodent antibody, e.g., a rat or mouse antibody, and the recipient will be a human framework or a human consensus framework. Typically, the immunoglobulin providing the CDR's is called the “donor” and the immunoglobulin providing the framework is called the “acceptor.” In one embodiment, the donor immunoglobulin is a non-human (e.g., rodent) immunoglobulin. The acceptor framework is a naturally-occurring (e.g., a human) framework or a consensus framework, or a sequence about 85% or higher, preferably 90%, 95%, 99% or higher identical thereto.

[1053] As used herein, the term “consensus sequence” refers to the sequence formed from the most frequently occurring amino acids (or nucleotides) in a family of related sequences (see, e.g., Winnaker, From Genes to Clones (Verlagsgesellschaft, Weinheim, Germany 1987)). In a family of proteins, each position in the consensus sequence is occupied by the amino acid occurring most frequently at that position in the family. If two amino acids occur equally frequently, either can be included in the consensus sequence. A “consensus framework” refers to the framework region in the consensus immunoglobulin sequence.
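
The definition above can be illustrated with a minimal Python sketch. It assumes the family has already been aligned to equal length; the function name `consensus_sequence`, the example sequences, and the tie-breaking rule (alphabetically first residue) are illustrative assumptions, since the text states only that either of two equally frequent residues may be used.

```python
from collections import Counter


def consensus_sequence(aligned):
    """Return the per-position majority residue of pre-aligned sequences.

    Hypothetical helper for illustration. Ties are resolved by taking the
    alphabetically first residue, which is one arbitrary but deterministic
    way of choosing "either" of two equally frequent residues.
    """
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")

    consensus = []
    for column in zip(*aligned):          # iterate over alignment columns
        counts = Counter(column)
        top = max(counts.values())        # highest frequency in this column
        consensus.append(min(r for r, n in counts.items() if n == top))
    return "".join(consensus)


# Illustrative, made-up framework fragments (not taken from any SEQ ID NO).
family = ["QVQLV", "EVQLV", "QVKLV"]
print(consensus_sequence(family))
```

A real consensus framework would be derived from a curated alignment of many immunoglobulin sequences; the three short strings here serve only to show the counting step.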

[1054] An antibody can be humanized by methods known in the art. Humanized antibodies can be generated by replacing sequences of the Fv variable region which are not directly involved in antigen binding with equivalent sequences from human Fv variable regions. General methods for generating humanized antibodies are provided by Morrison, S. L., 1985, Science 229:1202-1207, by Oi et al., 1986, BioTechniques 4:214, and by Queen et al. U.S. Pat. No. 5,585,089, U.S. Pat. No. 5,693,761 and U.S. Pat. No. 5,693,762, the contents of all of which are hereby incorporated by reference. Those methods include isolating, manipulating, and expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable regions from at least one of a heavy or light chain. Sources of such nucleic acid are well known to those skilled in the art and, for example, may be obtained from a hybridoma producing an antibody against a 26493 polypeptide or fragment thereof. The recombinant DNA encoding the humanized antibody, or fragment thereof, can then be cloned into an appropriate expression vector.

[1055] Humanized or CDR-grafted antibodies can be produced by CDR-grafting or CDR substitution, wherein one, two, or all CDR's of an immunoglobulin chain can be replaced. See e.g., U.S. Pat. No. 5,225,539; Jones et al. 1986 Nature 321:522-525; Verhoeyen et al. 1988 Science 239:1534; Beidler et al. 1988 J. Immunol. 141:4053-4060, the contents of all of which are hereby expressly incorporated by reference. Winter describes a CDR-grafting method which may be used to prepare the humanized antibodies of the present invention (UK Patent Application GB 2188638A, filed on Mar. 26, 1987; Winter U.S. Pat. No. 5,225,539), the contents of which are expressly incorporated by reference.

[1056] Also within the scope of the invention are humanized antibodies in which specific amino acids have been substituted, deleted or added. Preferred humanized antibodies have amino acid substitutions in the framework region, such as to improve binding to the antigen. For example, a humanized antibody will have framework residues identical to the donor framework residue or to another amino acid other than the recipient framework residue. To generate such antibodies, a selected, small number of acceptor framework residues of the humanized immunoglobulin chain can be replaced by the corresponding donor amino acids. Preferred locations of the substitutions include amino acid residues adjacent to the CDR, or which are capable of interacting with a CDR (see e.g., U.S. Pat. No. 5,585,089). Criteria for selecting amino acids from the donor are described in U.S. Pat. No. 5,585,089, e.g., columns 12-16 of U.S. Pat. No. 5,585,089, the contents of which are hereby incorporated by reference. Other techniques for humanizing antibodies are described in Padlan et al. EP 519596 A1, published on Dec. 23, 1992.

[1057] In preferred embodiments an antibody can be made by immunizing with purified 26493 antigen, or a fragment thereof, e.g., a fragment described herein, membrane associated antigen, tissue, e.g., crude tissue preparations, whole cells, preferably living cells, lysed cells, or cell fractions, e.g., membrane fractions.

[1058] A full-length 26493 protein, or an antigenic peptide fragment of 26493, can be used as an immunogen or can be used to identify anti-26493 antibodies made with other immunogens, e.g., cells, membrane preparations, and the like. The antigenic peptide of 26493 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO:17 and encompass an epitope of 26493. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.

[1059] Fragments of 26493 which include residues about 1 to 281, 122 to 251, or 302 to 404 of SEQ ID NO: 17 can be used to make antibodies, e.g., used as immunogens, or used to characterize the specificity of an antibody. A fragment which includes amino acid residues about 122 to 251 can be used to make antibodies against the mutT domain of the 26493 protein. Antibodies can be made against the hydrophilic regions of the 26493 protein, e.g., the sequence from about amino acid residue 360 to 370 of SEQ ID NO:17. Similarly, a fragment of 26493 which includes from about amino acids 85 to 101, about 121 to 125 and from about 350 to 360 of SEQ ID NO: 17 can be used to make an antibody against a hydrophobic region of the 26493 protein. Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[1060] Antibodies which bind only native 26493 protein, only denatured or otherwise non-native 26493 protein, or which bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be identified by identifying antibodies which bind to native but not denatured 26493 protein.

[1061] Preferred epitopes encompassed by the antigenic peptide are regions of 26493 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 26493 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 26493 protein and are thus likely to constitute surface residues useful for targeting antibody production.
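
As a rough illustration of how hydrophilic candidate regions might be located in a sequence, the following Python sketch averages Kyte-Doolittle hydropathy values over a sliding window. This is a simpler stand-in technique, not the Emini surface probability analysis named above. The window length, the threshold, the function names, and the example string are all assumptions chosen for illustration; the example string is not SEQ ID NO:17.

```python
# Kyte-Doolittle hydropathy index (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(sequence, window=7):
    """Mean hydropathy of every window of the given length."""
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophilic_windows(sequence, window=7, threshold=-1.5):
    """List (start, end, mean) for windows at or below the threshold.

    Positions are 1-based and inclusive. The default threshold is an
    arbitrary illustrative cut-off, not a value taken from the text.
    """
    return [
        (i + 1, i + window, round(mean, 2))
        for i, mean in enumerate(window_hydropathy(sequence, window))
        if mean <= threshold
    ]


# Made-up example: a hydrophobic stretch followed by a charged stretch.
for start, end, mean in hydrophilic_windows("IIIIKKKK", window=4):
    print(start, end, mean)
```

Windows flagged this way are only candidates; whether a region is actually surface-exposed or antigenic would still have to be confirmed experimentally.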

[1062] In preferred embodiments antibodies can bind one or more of purified antigen, membrane associated antigen, tissue, e.g., tissue sections, whole cells, preferably living cells, lysed cells, cell fractions, e.g., membrane fractions.

[1063] The anti-26493 antibody can be a single chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher, D. et al. (1999) Ann N Y Acad Sci 880:263-80; and Reiter, Y. (1996) Clin Cancer Res 2:245-52). The single chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 26493 protein.

[1064] In a preferred embodiment, the antibody has effector function and can fix complement. In other embodiments, the antibody does not recruit effector cells or fix complement.

[1065] In a preferred embodiment, the antibody has reduced or no ability to bind an Fc receptor. For example, it is an isotype or subtype, fragment or other mutant, which does not support binding to an Fc receptor, e.g., it has a mutagenized or deleted Fc receptor binding region.

[1066] In a preferred embodiment, an anti-26493 antibody alters (e.g., increases or decreases) the 26493 activity (for example, the ability to catalyze the removal of an oxidatively damaged form of guanine (e.g., 8-hydroxyguanine or 7,8-dihydro-8-oxoguanine) from DNA and/or the nucleotide pool) of a 26493 polypeptide. For example, the antibody can bind at or in proximity to the active site, e.g., to an epitope that includes a residue located from about 122 to 251 of SEQ ID NO: 17.

[1067] The antibody can be coupled to a toxin, e.g., a polypeptide toxin, e.g., ricin or diphtheria toxin or an active fragment thereof, or to a radioactive nucleus or an imaging agent, e.g., a radioactive, enzymatic, or other imaging agent, e.g., an NMR contrast agent. Labels which produce detectable radioactive emissions or fluorescence are preferred.

[1068] An anti-26493 antibody (e.g., monoclonal antibody) can be used to isolate 26493 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-26493 antibody can be used to detect 26493 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-26493 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labelling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

[1069] The invention also includes a nucleic acid which encodes an anti-26493 antibody, e.g., an anti-26493 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g., CHO or lymphatic cells.

[1070] The invention also includes cell lines, e.g., hybridomas, which make an anti-26493 antibody, e.g., an antibody described herein, and methods of using said cells to make a 26493 antibody.

[1071] 26493 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[1072] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[1073] A vector can include a 26493 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 26493 proteins, mutant forms of 26493 proteins, fusion proteins, and the like).

[1074] The recombinant expression vectors of the invention can be designed for expression of 26493 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[1075] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.
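
The separation of a fusion moiety from the recombinant protein at an engineered cleavage site can be sketched in Python as follows. The recognition motifs used here (Factor Xa: IEGR; thrombin: LVPRGS, cut between R and G; enterokinase: DDDDK) are the commonly cited ones and are supplied as assumptions, since the text names the enzymes but not their sequences. The function name and the example fusion strings are hypothetical.

```python
# Commonly cited recognition motifs and the offset of the scissile bond
# within each motif. These values are assumptions, not taken from the text.
CLEAVAGE_SITES = {
    "factor_xa": ("IEGR", 4),      # cuts after the arginine
    "thrombin": ("LVPRGS", 4),     # cuts between arginine and glycine
    "enterokinase": ("DDDDK", 5),  # cuts after the lysine
}


def cleave_fusion(fusion, protease):
    """Split a fusion protein at the first occurrence of the protease motif.

    Returns (fusion_moiety_with_site, released_protein). Illustrative only:
    real proteases may also cut at secondary sites or be blocked by structure.
    """
    motif, offset = CLEAVAGE_SITES[protease]
    index = fusion.find(motif)
    if index == -1:
        raise ValueError(f"no {protease} site found in sequence")
    cut = index + offset
    return fusion[:cut], fusion[cut:]


# Made-up fusion: a short tag, a Factor Xa site, then a made-up target protein.
tag_part, target = cleave_fusion("MSPILGIEGRMKTAY", "factor_xa")
print(tag_part, target)
```

Placing the site directly at the junction, as the paragraph describes, is what lets the released protein begin at or near its native first residue.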

[1076] Purified fusion proteins can be used in 26493 activity assays, (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 26493 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells which are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six weeks).

[1077] One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
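
The codon-replacement strategy can be sketched in Python as a back-translation of a protein sequence using one preferred codon per amino acid. The small table below is an illustrative placeholder covering only a few residues; a real design would use a complete table built from measured E. coli codon usage (for example, the data in the Wada et al. reference cited above). The function name `back_translate` and the example peptide are hypothetical.

```python
# Placeholder preferred-codon table (assumption, deliberately incomplete).
# Replace with a full table derived from measured E. coli codon usage.
PREFERRED_CODONS = {
    "M": "ATG",  # methionine has a single codon
    "W": "TGG",  # tryptophan has a single codon
    "K": "AAA",
    "L": "CTG",
    "E": "GAA",
    "A": "GCG",
}


def back_translate(protein, preferred):
    """Encode a protein using one preferred codon per amino acid."""
    missing = sorted(set(protein) - set(preferred))
    if missing:
        raise KeyError(f"no preferred codon listed for: {', '.join(missing)}")
    return "".join(preferred[residue] for residue in protein)


# Made-up five-residue peptide, not a fragment of any disclosed sequence.
print(back_translate("MKLEA", PREFERRED_CODONS))
```

Choosing a single codon per residue is the simplest form of this strategy; practical designs often also avoid repeats, restriction sites, and strong mRNA secondary structure.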

[1078] The 26493 expression vector can be a yeast expression vector, a vector for expression in insect cells, e.g., a baculovirus expression vector or a vector suitable for expression in mammalian cells.

[1079] When used in mammalian cells, the expression vector's control functions can be provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[1080] In another embodiment, the promoter is an inducible promoter, e.g., a promoter regulated by a steroid hormone, by a polypeptide hormone (e.g., by means of a signal transduction pathway), or by a heterologous polypeptide (e.g., the tetracycline-inducible systems, “Tet-On” and “Tet-Off”; see, e.g., Clontech Inc., CA, Gossen and Bujard (1992) Proc. Natl. Acad. Sci. USA 89:5547, and Paillard (1989) Human Gene Therapy 9:983).

[1081] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Campes and Tilghman (1989) Genes Dev. 3:537-546).

[1082] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus.

[1083] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., a 26493 nucleic acid molecule within a recombinant expression vector or a 26493 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[1084] A host cell can be any prokaryotic or eukaryotic cell. For example, a 26493 protein can be expressed in bacterial cells (such as E. coli), insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells). Other suitable host cells are known to those skilled in the art.

[1085] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[1086] A host cell of the invention can be used to produce (i.e., express) a 26493 protein. Accordingly, the invention further provides methods for producing a 26493 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a 26493 protein has been introduced) in a suitable medium such that a 26493 protein is produced. In another embodiment, the method further includes isolating a 26493 protein from the medium or the host cell.

[1087] In another aspect, the invention features a cell or purified preparation of cells which include a 26493 transgene, or which otherwise misexpress 26493. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include a 26493 transgene, e.g., a heterologous form of a 26493, e.g., a gene derived from humans (in the case of a non-human cell). The 26493 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene that misexpresses an endogenous 26493, e.g., a gene the expression of which is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders that are related to mutated or misexpressed 26493 alleles or for use in drug screening.

[1088] In another aspect, the invention features a human cell, e.g., a hematopoietic stem cell, transformed with nucleic acid which encodes a subject 26493 polypeptide.

[1089] Also provided are cells, preferably human cells, e.g., human hematopoietic or fibroblast cells, in which an endogenous 26493 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 26493 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 26493 gene. For example, an endogenous 26493 gene which is “transcriptionally silent,” e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques such as targeted homologous recombination can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published on May 16, 1991.

[1090] In a preferred embodiment, recombinant cells described herein can be used for replacement therapy in a subject. For example, a nucleic acid encoding a 26493 polypeptide operably linked to an inducible promoter (e.g., a steroid hormone receptor-regulated promoter) is introduced into a human or nonhuman, e.g., mammalian, e.g., porcine recombinant cell. The cell is cultivated and encapsulated in a biocompatible material, such as poly-lysine alginate, and subsequently implanted into the subject. See, e.g., Lanza (1996) Nat. Biotechnol. 14:1107; Joki et al. (2001) Nat. Biotechnol. 19:35; and U.S. Pat. No. 5,876,742. Production of 26493 polypeptide can be regulated in the subject by administering an agent (e.g., a steroid hormone) to the subject. In another preferred embodiment, the implanted recombinant cells express and secrete an antibody specific for a 26493 polypeptide. The antibody can be any antibody or any antibody derivative described herein.

[1091] 26493 Transgenic Animals

[1092] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of a 26493 protein and for identifying and/or evaluating modulators of 26493 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 26493 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[1093] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of a 26493 protein to particular cells. A transgenic founder animal can be identified based upon the presence of a 26493 transgene in its genome and/or expression of 26493 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a 26493 protein can further be bred to other transgenic animals carrying other transgenes.

[1094] 26493 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments the nucleic acid is placed under the control of a tissue specific promoter, e.g., a milk or egg specific promoter, and the protein or polypeptide is recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[1095] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[1096] Uses of 26493

[1097] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: a) screening assays; b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); and c) methods of treatment (e.g., therapeutic and prophylactic).

[1098] The proteins of the invention can be used in vitro to synthesize products, wherein the synthesis of such products requires activities such as catalyzing the removal of an oxidatively damaged form of guanine (e.g., 8-hydroxyguanine or 7,8-dihydro-8-oxoguanine) from a DNA molecule and the nucleotide pool, or degrading an oxidatively damaged form of guanine, e.g., an oxoguanine dGTPase activity.

[1099] The isolated nucleic acid molecules of the invention can be used, for example, to express a 26493 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect a 26493 mRNA (e.g., in a biological sample) or a genetic alteration in a 26493 gene, and to modulate 26493 activity, as described further below. The 26493 proteins can be used to treat disorders characterized by insufficient or excessive production of a 26493 substrate or production of 26493 inhibitors. In addition, the 26493 proteins can be used to screen for naturally occurring 26493 substrates, to screen for drugs or compounds which modulate 26493 activity, as well as to treat disorders characterized by insufficient or excessive production of 26493 protein or production of 26493 protein forms which have decreased, aberrant or unwanted activity compared to 26493 wild type protein. Moreover, the anti-26493 antibodies of the invention can be used to detect and isolate 26493 proteins, regulate the bioavailability of 26493 proteins, and modulate 26493 activity.

[1100] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 26493 polypeptide is provided. The method includes: contacting the compound with the subject 26493 polypeptide; and evaluating ability of the compound to interact with, e.g., to bind or form a complex with the subject 26493 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with subject 26493 polypeptide. It can also be used to find natural or synthetic inhibitors of subject 26493 polypeptide. Screening methods are discussed in more detail below.

[1101] 26493 Screening Assays

[1102] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to 26493 proteins, have a stimulatory or inhibitory effect on, for example, 26493 expression or 26493 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 26493 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 26493 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[1103] In one embodiment, the invention provides assays for screening candidate or test compounds which are substrates of a 26493 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate an activity of a 26493 protein or polypeptide or a biologically active portion thereof.

[1104] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann, R. N. et al. (1994) J. Med. Chem. 37:2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).

[1105] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. USA 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and Gallop et al. (1994) J. Med. Chem. 37:1233.

[1106] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner, U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. 5,223,409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382; Felici (1991) J. Mol. Biol. 222:301-310; Ladner supra.).

[1107] In one embodiment, an assay is a cell-based assay in which a cell which expresses a 26493 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 26493 activity is determined. Determining the ability of the test compound to modulate 26493 activity can be accomplished by monitoring, for example, an oxoguanine dGTPase activity. The cell, for example, can be of mammalian origin, e.g., human.

[1108] The ability of the test compound to modulate 26493 binding to a compound, e.g., a 26493 substrate, or to bind to 26493 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 26493 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 26493 could be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 26493 binding to a 26493 substrate in a complex. For example, compounds (e.g., 26493 substrates) can be labeled with ¹²⁵I, ³⁵S, ¹⁴C, or ³H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[1109] The ability of a compound (e.g., a 26493 substrate) to interact with 26493 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 26493 without the labeling of either the compound or the 26493. McConnell, H. M. et al. (1992) Science 257:1906-1912. As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 26493.

[1110] In yet another embodiment, a cell-free assay is provided in which a 26493 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 26493 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 26493 proteins to be used in assays of the present invention include fragments which participate in interactions with non-26493 molecules, e.g., fragments with high surface probability scores.

[1111] Soluble and/or membrane-bound forms of isolated proteins (e.g., 26493 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, isotridecylpoly(ethylene glycol ether)ₙ, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[1112] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[1113] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternately, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
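
The distance dependence noted above is commonly modeled by the Förster relation, in which transfer efficiency falls off with the sixth power of the donor-acceptor separation. The following Python sketch illustrates that relation only. The function name and the default Förster radius of 5 nm are illustrative assumptions and are not values taken from this disclosure.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float = 5.0) -> float:
    """Return the Forster transfer efficiency E = 1 / (1 + (r / R0)**6).

    distance_nm       -- donor-acceptor separation r (assumed units: nm)
    forster_radius_nm -- R0, the separation at which E = 0.5 (assumed value)
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Forster radius must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


if __name__ == "__main__":
    # Efficiency is high when the labels are close (bound) and low when far apart.
    for r in (2.5, 5.0, 10.0):
        print(f"r = {r:4.1f} nm -> E = {fret_efficiency(r):.3f}")
```

Under this model, a bound donor/acceptor pair held well inside the Förster radius gives near-maximal acceptor emission, whereas unbound molecules at larger separations give little transfer.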

[1114] In another embodiment, determining the ability of the 26493 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal which can be used as an indication of real-time reactions between biological molecules.

[1115] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound, (which is not anchored), can be labeled, either directly or indirectly, with detectable labels discussed herein.

[1116] It may be desirable to immobilize either 26493, an anti-26493 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a 26493 protein, or interaction of a 26493 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/26493 fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 26493 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and complex formation is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 26493 binding or activity determined using standard techniques.

[1117] Other techniques for immobilizing either a 26493 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 26493 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical).

[1118] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[1119] In one embodiment, this assay is performed utilizing antibodies reactive with 26493 protein or target molecules but which do not interfere with binding of the 26493 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 26493 protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 26493 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 26493 protein or target molecule.

[1120] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components, by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., (1993) Trends Biochem Sci 18:284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York.); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. (1999) Current Protocols in Molecular Biology, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., (1998) J Mol Recognit 11:141-8; Hage, D. S., and Tweed, S. A. (1997) J Chromatogr B Biomed Sci Appl. 699:499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[1121] In a preferred embodiment, the assay includes contacting the 26493 protein or biologically active portion thereof with a known compound which binds 26493 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a 26493 protein, wherein determining the ability of the test compound to interact with a 26493 protein includes determining the ability of the test compound to preferentially bind to 26493 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[1122] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 26493 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of a 26493 protein through modulation of the activity of a downstream effector of a 26493 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[1123] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), a reaction mixture containing the target gene product and the binding partner is prepared, under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.
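
The comparison between control and test reaction mixtures described above can be expressed as a percent inhibition of complex formation. The following Python sketch is one way to do so. It assumes that complex formation is read out as a numeric signal (e.g., counts from a labeled binding partner) and that a background reading may be subtracted; the function name, the parameters, and the example values are hypothetical.

```python
def percent_inhibition(control_signal: float,
                       test_signal: float,
                       background: float = 0.0) -> float:
    """Percent reduction in complex signal caused by a test compound.

    control_signal -- signal from the reaction without test compound (or with placebo)
    test_signal    -- signal from the reaction containing the test compound
    background     -- signal in the absence of complex (assumed; defaults to 0)
    """
    control = control_signal - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    test = test_signal - background
    return 100.0 * (1.0 - test / control)


if __name__ == "__main__":
    # Hypothetical readings: the test compound reduces complex signal four-fold.
    print(percent_inhibition(control_signal=1100.0, test_signal=350.0, background=100.0))
```

A value near 100 would correspond to a complex formed in the control but not in the presence of the test compound; a value near 0 would indicate no interference.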

[1124] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[1125] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner, is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled, either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[1126] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[1127] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[1128] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in which either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[1129] In yet another aspect, the 26493 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 26493 (“26493-binding proteins” or “26493-bp”) and are involved in 26493 activity. Such 26493-bps can be activators or inhibitors of signals by the 26493 proteins or 26493 targets as, for example, downstream elements of a 26493-mediated signaling pathway.

[1130] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a 26493 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively, the 26493 protein can be fused to the activator domain.) If the “bait” and the “prey” proteins are able to interact, in vivo, forming a 26493-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., lacZ) which is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene which encodes the protein which interacts with the 26493 protein.

[1131] In another embodiment, modulators of 26493 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 26493 mRNA or protein evaluated relative to the level of expression of 26493 mRNA or protein in the absence of the candidate compound. When expression of 26493 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 26493 mRNA or protein expression. Alternatively, when expression of 26493 mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 26493 mRNA or protein expression. The level of 26493 mRNA or protein expression can be determined by methods described herein for detecting 26493 mRNA or protein.
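
The decision rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: expression is taken to be measured as replicate numeric values with and without the candidate compound, significance is assessed with an exact two-sided permutation test on the difference in means, and the 0.05 threshold is an arbitrary choice. The function names, the test, and the threshold are not prescribed by this disclosure.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Exact two-sided permutation p-value for the difference in group means."""
    pooled = list(treated) + list(untreated)
    n_treated = len(treated)
    observed = abs(mean(treated) - mean(untreated))
    indices = range(len(pooled))
    total = 0
    extreme = 0
    # Enumerate every way of relabeling the pooled values as "treated".
    for chosen in combinations(indices, n_treated):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def classify_modulator(treated, untreated, alpha=0.05):
    """Label a candidate compound from replicate expression measurements."""
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"


if __name__ == "__main__":
    # Hypothetical expression levels (arbitrary units), four replicates each.
    print(classify_modulator([9.8, 10.1, 10.4, 9.9], [5.0, 5.2, 4.9, 5.1]))
```

With four replicates per group there are 70 possible relabelings, so the smallest attainable two-sided p-value is 2/70; fewer replicates could not reach the 0.05 threshold under this particular test.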

[1132] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a 26493 protein can be confirmed in vivo, e.g., in an animal such as an animal model for cancer.

[1133] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a 26493 modulating agent, an antisense 26493 nucleic acid molecule, a 26493-specific antibody, or a 26493-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[1134] 26493 Detection Assays

[1135] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome, e.g., to locate gene regions associated with genetic disease or to associate 26493 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[1136] 26493 Chromosome Mapping

[1137] The 26493 nucleotide sequences or portions thereof can be used to map the location of the 26493 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 26493 sequences with genes associated with disease.

[1138] Briefly, 26493 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 26493 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 26493 sequences will yield an amplified fragment.
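
As an illustration of preparing a primer pair within the preferred 15-25 bp range, the following Python sketch takes the forward primer from the 5′ end of a template and the reverse primer as the reverse complement of its 3′ end. The function names are hypothetical and the template shown is a placeholder rather than a 26493 sequence. Practical primer design would also consider melting temperature, GC content, and specificity, none of which is modeled here.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def end_primers(template: str, length: int = 20):
    """Return (forward, reverse) primers taken from the two ends of the template."""
    if not 15 <= length <= 25:
        raise ValueError("primer length should be 15-25 bases")
    template = template.upper()
    if set(template) - set("ACGT"):
        raise ValueError("template may contain only A, C, G, T")
    if len(template) < 2 * length:
        raise ValueError("template is too short for two non-overlapping primers")
    forward = template[:length]                       # matches the 5' end of the top strand
    reverse = reverse_complement(template[-length:])  # anneals to the 3' end of the top strand
    return forward, reverse


if __name__ == "__main__":
    placeholder = "A" * 15 + "C" * 10 + "G" * 15  # placeholder, not a real gene sequence
    print(end_primers(placeholder, length=15))
```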

[1139] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes, and a full set of mouse chromosomes, can allow easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220:919-924).
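
The logic of assigning a gene to a chromosome from such a panel can be sketched as a concordance check: the gene is assigned to a human chromosome that is present in every hybrid that yields an amplified fragment and absent from every hybrid that does not. The Python sketch below assumes a hypothetical panel represented as a mapping from hybrid cell line name to the set of human chromosomes it retains; the names and data are illustrative only.

```python
def concordant_chromosomes(panel, pcr_positive):
    """Return chromosomes whose presence matches the PCR result in every hybrid.

    panel        -- dict: hybrid name -> set of human chromosomes retained
    pcr_positive -- set of hybrid names that yielded an amplified fragment
    """
    all_chromosomes = set().union(*panel.values())
    matches = []
    for chromosome in all_chromosomes:
        # Concordant if present exactly in the PCR-positive hybrids.
        if all((chromosome in retained) == (name in pcr_positive)
               for name, retained in panel.items()):
            matches.append(chromosome)
    return sorted(matches)


if __name__ == "__main__":
    # Hypothetical four-line panel and PCR results.
    panel = {
        "H1": {"1", "7"},
        "H2": {"7", "12"},
        "H3": {"1", "12"},
        "H4": {"3"},
    }
    print(concordant_chromosomes(panel, pcr_positive={"H1", "H2"}))
```

If more than one chromosome is returned, the panel does not discriminate among them and additional hybrids would be needed.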

[1140] Other mapping strategies, e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries, can be used to map 26493 to a chromosomal location.

[1141] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases, will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques ((1988) Pergamon Press, New York).

[1142] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[1143] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325:783-787.

[1144] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 26493 gene, can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

[1145] 26493 Tissue Typing

[1146] 26493 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).

[1147] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 26493 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.
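The primer preparation described above can be illustrated with a minimal sketch. It assumes a forward primer copied from the 5′ end of the plus strand and a reverse primer taken as the reverse complement of the 3′ end. The function names, the default primer length, and the toy sequence are hypothetical and are not taken from SEQ ID NO:16.

```python
# Illustrative sketch only: helper names and the default primer length are assumptions.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence containing only A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def end_primers(template, primer_len=20):
    """Derive a forward primer from the 5' end and a reverse primer from the 3' end."""
    if len(template) < 2 * primer_len:
        raise ValueError("template is too short for two non-overlapping primers")
    template = template.upper()
    forward = template[:primer_len]
    # The reverse primer anneals to the plus strand's 3' end, so it is the
    # reverse complement of the last primer_len bases.
    reverse = reverse_complement(template[-primer_len:])
    return forward, reverse


# Toy sequence for illustration; it is not part of any sequence in this document.
toy = "ATGGCCAAGTTGACCGGTTCAGCTAGGCTTAACGGATCCGTA"
fwd, rev = end_primers(toy, primer_len=10)
print(fwd, rev)
```

In practice, primer selection would also weigh melting temperature, secondary structure, and specificity, none of which this sketch attempts.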

[1148] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences of SEQ ID NO:16 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO: 18 are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[1149] If a panel of reagents from 26493 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[1150] Use of Partial 26493 Sequences in Forensic Biology

[1151] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[1152] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e. another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO: 16 (e.g., fragments derived from the noncoding regions of SEQ ID NO: 16 having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[1153] The 26493 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 26493 probes can be used to identify tissue by species and/or by organ type.

[1154] In a similar fashion, these reagents, e.g., 26493 primers or probes can be used to screen tissue culture for contamination (i.e. screen for the presence of a mixture of different types of cells in a culture).

[1155] Predictive Medicine of 26493

[1156] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[1157] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene which encodes 26493.

[1158] Such disorders include, e.g., a disorder associated with the misexpression of the 26493 gene, or a disorder of the immune system.

[1159] The method includes one or more of the following:

[1160] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 26493 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[1161] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 26493 gene;

[1162] detecting, in a tissue of the subject, the misexpression of the 26493 gene, at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[1163] detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of a 26493 polypeptide.

[1164] In preferred embodiments the method includes: ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 26493 gene; an insertion of one or more nucleotides into the gene, a point mutation, e.g., a substitution of one or more nucleotides of the gene, a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[1165] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from SEQ ID NO: 16, or naturally occurring mutants thereof, or 5′ or 3′ flanking sequences naturally associated with the 26493 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[1166] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 26493 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 26493.

[1167] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[1168] In preferred embodiments the method includes determining the structure of a 26493 gene, an abnormal structure being indicative of risk for the disorder.

[1169] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 26493 protein or a nucleic acid, which hybridizes specifically with the gene. These and other embodiments are discussed below.

[1170] Diagnostic and Prognostic Assays of 26493

[1171] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 26493 molecules and for identifying variations and mutations in the sequence of 26493 molecules.

[1172] Expression Monitoring and Profiling:

[1173] The presence, level, or absence of 26493 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 26493 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 26493 protein such that the presence of 26493 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the 26493 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 26493 genes; measuring the amount of protein encoded by the 26493 genes; or measuring the activity of the protein encoded by the 26493 genes.

[1174] The level of mRNA corresponding to the 26493 gene in a cell can be determined both by in situ and by in vitro formats.

[1175] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 26493 nucleic acid, such as the nucleic acid of SEQ ID NO: 16, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 26493 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[1176] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 26493 genes.

[1177] The level of mRNA in a sample that is encoded by 26493 can be evaluated with nucleic acid amplification, e.g., by rtPCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
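The primer geometry described above (a pair annealing to opposite strands and flanking a short region) can be checked in silico. The following sketch assumes exact-match annealing and a single binding site per primer. All names and the toy template are hypothetical, and the length limits are simply the approximate ranges quoted in the text.

```python
# Illustrative sketch only: exact-match annealing is an assumption.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence containing only A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def flanked_region(template, forward, reverse):
    """Return the plus-strand segment lying between the two primer sites, or None.

    The forward primer must match the plus strand. The reverse primer must match
    the minus strand, so its reverse complement is searched for on the plus strand
    downstream of the forward primer site.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    inner_start = start + len(forward)
    end = template.find(reverse_complement(reverse), inner_start)
    if end == -1:
        return None
    return template[inner_start:end]


def within_stated_ranges(forward, reverse, region):
    """Check primers of 10 to 30 nucleotides flanking a region of 50 to 200 nucleotides."""
    primers_ok = all(10 <= len(p) <= 30 for p in (forward, reverse))
    return primers_ok and 50 <= len(region) <= 200


# Toy template: forward site, a 60-base inner region, then the reverse primer site.
forward_primer = "ATGGCCAAGTTG"
reverse_site = "CGGATCCGTAGC"
template = forward_primer + "ACGT" * 15 + reverse_site
reverse_primer = reverse_complement(reverse_site)

region = flanked_region(template, forward_primer, reverse_primer)
print(len(region), within_stated_ranges(forward_primer, reverse_primer, region))
```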

[1178] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 26493 gene being analyzed.

[1179] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 26493 mRNA, or genomic DNA, and comparing the presence of 26493 mRNA or genomic DNA in the control sample with the presence of 26493 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 26493 transcript levels.

[1180] A variety of methods can be used to determine the level of protein encoded by 26493. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody with a sample, to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)₂) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[1181] The detection methods can be used to detect 26493 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 26493 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 26493 protein include introducing into a subject a labeled anti-26493 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-26493 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[1182] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 26493 protein, and comparing the presence of 26493 protein in the control sample with the presence of 26493 protein in the test sample.

[1183] The invention also includes kits for detecting the presence of 26493 in a biological sample. For example, the kit can include a compound or agent capable of detecting 26493 protein or mRNA in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 26493 protein or nucleic acid.

[1184] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[1185] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples which can be assayed and compared to the test sample contained. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[1186] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with misexpressed or aberrant or unwanted 26493 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as pain or deregulated cell proliferation.

[1187] In one embodiment, a disease or disorder associated with aberrant or unwanted 26493 expression or activity is identified. A test sample is obtained from a subject and 26493 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 26493 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 26493 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[1188] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 26493 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for a cell proliferative or differentiative disorder.

[1189] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 26493 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 26493 (e.g., other genes associated with a 26493-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
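As an illustration of how such data records might be laid out as a relational table, the sketch below uses Python's built-in SQLite module as a stand-in for the Oracle or Sybase environments named above. The table name, column names, the placeholder gene GENE_X, and all values are hypothetical.

```python
import sqlite3

# Illustrative sketch only: schema and data are assumptions, and SQLite is
# used here in place of the database environments named in the text.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE expression_record (
        sample_id  TEXT NOT NULL,   -- identifier of the sample
        subject_id TEXT,            -- subject from which the sample was derived
        diagnosis  TEXT,            -- descriptor: diagnosis
        treatment  TEXT,            -- descriptor: treatment
        gene       TEXT NOT NULL,   -- 26493 or another gene on the array
        level      REAL NOT NULL,   -- value representing the expression level
        PRIMARY KEY (sample_id, gene)
    )
    """
)

rows = [
    ("S1", "P1", "unaffected", None, "26493", 1.0),
    ("S1", "P1", "unaffected", None, "GENE_X", 2.5),
    ("S2", "P2", "affected", "agent A", "26493", 4.2),
    ("S2", "P2", "affected", "agent A", "GENE_X", 2.4),
]
conn.executemany("INSERT INTO expression_record VALUES (?, ?, ?, ?, ?, ?)", rows)


def levels_for_gene(connection, gene):
    """Return (sample_id, diagnosis, level) rows for one gene, ordered by sample."""
    cursor = connection.execute(
        "SELECT sample_id, diagnosis, level FROM expression_record "
        "WHERE gene = ? ORDER BY sample_id",
        (gene,),
    )
    return cursor.fetchall()


print(levels_for_gene(conn, "26493"))
```

Storing one row per sample and gene keeps the 26493 value and the values for other genes in the same table, so a full profile for a sample is a simple query on `sample_id`.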

[1190] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 26493 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used, for example, to diagnose a disorder, e.g., a disorder as described herein, in a subject. The method can be used to monitor a treatment for a disorder, e.g., a disorder as described herein, in a subject. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[1191] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 26493 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from an uncontacted cell.
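The favorable-evaluation rule above can be expressed as a short sketch. The text does not prescribe a similarity measure, so the sum of absolute differences used here is an assumption, as are the function names and the toy profiles.

```python
# Illustrative sketch only: the similarity measure is an assumed choice.
def total_absolute_difference(profile_a, profile_b):
    """Sum of absolute per-gene differences; smaller means more similar."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genes in the same order")
    return sum(abs(a - b) for a, b in zip(profile_a, profile_b))


def evaluated_favorably(contacted, uncontacted, target):
    """True if the contacted-cell profile is closer to the target than the uncontacted one."""
    return (total_absolute_difference(contacted, target)
            < total_absolute_difference(uncontacted, target))


# Toy profiles; by assumption the first value in each is the 26493 expression level.
target = [1.0, 2.0, 1.0]        # e.g., a profile for a normal cell
uncontacted = [4.0, 2.0, 3.0]   # cell not exposed to the test compound
contacted = [2.0, 2.0, 1.5]     # cell after exposure to the test compound

print(evaluated_favorably(contacted, uncontacted, target))
```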

[1192] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 26493 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profiles is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
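The distance-vector metric mentioned above corresponds to the Euclidean distance between two profiles treated as vectors. A minimal sketch of steps c) and d) follows. The function names, reference labels, and values are hypothetical.

```python
import math


def profile_distance(profile_p, profile_q):
    """Length of the difference vector between two profiles (Euclidean distance)."""
    if len(profile_p) != len(profile_q):
        raise ValueError("profiles must have the same dimensions")
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(profile_p, profile_q)))


def most_similar_reference(subject, references):
    """Return (name, distance) of the reference profile closest to the subject profile.

    `references` maps a descriptive name to a profile vector.
    """
    name = min(references, key=lambda n: profile_distance(subject, references[n]))
    return name, profile_distance(subject, references[name])


# Toy data: each dimension is one value in the profile, e.g., the level of 26493
# expression followed by the levels of two other genes.
subject = [3.0, 4.0, 1.0]
references = {
    "normal": [1.0, 1.0, 1.0],
    "disorder": [3.0, 5.0, 1.0],
}

print(most_similar_reference(subject, references))
```

Euclidean distance is sensitive to the scale of each dimension, so in practice the values would typically be normalized before comparison.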

[1193] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[1194] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile, and the reference expression profiles each include a value representing the level of 26493 expression.
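The three steps recited above (receive a profile, access reference profiles, then select a match or compute a score) might be sketched as follows. The Pearson correlation coefficient is used as the comparison score purely as an assumption, since the text leaves the score open. The in-memory dictionary stands in for the database, and all names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return covariance / (spread_x * spread_y)


def score_against_database(subject, database):
    """Option ii): one comparison score per reference profile."""
    return {name: pearson(subject, reference) for name, reference in database.items()}


def select_matching_profile(subject, database):
    """Option i): name of the reference profile with the highest score."""
    scores = score_against_database(subject, database)
    return max(scores, key=scores.get)


# Toy stand-in for a database of reference expression profiles.
database = {
    "ref_a": [2.0, 4.0, 6.0, 8.0],
    "ref_b": [4.0, 3.0, 2.0, 1.0],
}
subject_profile = [1.0, 2.0, 3.0, 4.0]

scores = score_against_database(subject_profile, database)
print(select_matching_profile(subject_profile, database), scores)
```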

[1195] 26493 Arrays and Uses Thereof

[1196] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to a 26493 molecule (e.g., a 26493 nucleic acid or a 26493 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm², and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to the addresses of the plurality can be disposed on the array.

[1197] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to a 26493 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 26493. Each address of the subset can include a capture probe that hybridizes to a different region of a 26493 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for a 26493 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 26493 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 26493 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).
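A subset of addresses whose capture probes hybridize to different, overlapping regions of a target can be pictured as a tiling of the target sequence. The sketch below generates such probes as reverse complements of successive windows. The probe length, step size, function names, and toy target are assumptions chosen for illustration.

```python
# Illustrative sketch only: probe length and step size are assumed values.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence containing only A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def tiling_probes(target, probe_len=25, step=5):
    """Return (offset, probe) pairs, one per address.

    Each probe is complementary to a window of the target strand. Windows
    overlap whenever step is smaller than probe_len.
    """
    if probe_len <= 0 or step <= 0:
        raise ValueError("probe_len and step must be positive")
    target = target.upper()
    return [
        (offset, reverse_complement(target[offset:offset + probe_len]))
        for offset in range(0, len(target) - probe_len + 1, step)
    ]


# Toy target; it is not taken from any sequence in this document.
probes = tiling_probes("ATGGCCAAGTTG", probe_len=6, step=3)
for offset, probe in probes:
    print(offset, probe)
```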

[1198] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[1199] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to a 26493 polypeptide or fragment thereof. The polypeptide can be a naturally-occurring interaction partner of 26493 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-26493 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[1200] In another aspect, the invention features a method of analyzing the expression of 26493. The method includes providing an array as described above; contacting the array with a sample; and detecting binding of a 26493-molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[1201] In another embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array, particularly the expression of 26493. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 26493. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
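To make the clustering step concrete, the following sketch groups genes by their expression across tissues using a simple single-linkage agglomerative (hierarchical) procedure. The linkage rule, the distance measure, the placeholder gene names other than 26493, and all values are assumptions. A production analysis would more likely rely on an established statistics library.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(profiles, n_clusters):
    """Single-linkage agglomerative clustering.

    `profiles` maps a gene name to its expression levels across tissues.
    Starting from one cluster per gene, the two closest clusters are merged
    repeatedly until n_clusters remain.
    """
    clusters = [[gene] for gene in profiles]

    def linkage(first, second):
        # Single linkage: distance between the closest pair of members.
        return min(euclidean(profiles[g], profiles[h]) for g in first for h in second)

    while len(clusters) > max(n_clusters, 1):
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy expression levels in three tissues; gene names other than 26493 are placeholders.
profiles = {
    "26493": [5.0, 0.2, 0.1],
    "GENE_A": [4.8, 0.3, 0.2],
    "GENE_B": [0.1, 3.9, 4.1],
    "GENE_C": [0.2, 4.0, 4.0],
}

clusters = agglomerate(profiles, n_clusters=2)
print(clusters)
```

In this toy data, GENE_A falls into the same cluster as 26493, which is the kind of grouping that would flag a candidate co-regulated gene.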

[1202] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 26493 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[1203] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of other than the target gene can be ascertained and counteracted.

[1204] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of a 26493-associated disease or disorder; and processes, such as a cellular transformation associated with a 26493-associated disease or disorder. The method can also evaluate the treatment and/or progression of a 26493-associated disease or disorder.

[1205] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 26493) that could serve as a molecular target for diagnosis or therapeutic intervention.

[1206] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon a 26493 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to a 26493 polypeptide or fragment thereof. For example, multiple variants of a 26493 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the addresses of the plurality can be disposed on the array.

[1207] The polypeptide array can be used to detect a 26493 binding compound, e.g., an antibody in a sample from a subject with specificity for a 26493 polypeptide or the presence of a 26493-binding protein or ligand.

[1208] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 26493 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[1209] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 26493 or from a cell or subject in which a 26493-mediated response has been elicited, e.g., by contact of the cell with 26493 nucleic acid or protein, or administration to the cell or subject of 26493 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 26493 (or does not express as highly as in the case of the 26493 positive plurality of capture probes) or from a cell or subject in which a 26493-mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which are preferably other than a 26493 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[1210] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a first sample from a cell or subject which expresses or mis-expresses 26493 or from a cell or subject in which a 26493-mediated response has been elicited, e.g., by contact of the cell with 26493 nucleic acid or protein, or administration to the cell or subject of 26493 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a second sample from a cell or subject which does not express 26493 (or does not express it as highly as in the first sample) or from a cell or subject in which a 26493-mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used, the plurality of addresses with capture probes should be present on both arrays.
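
The comparison step of such a method can be illustrated with a minimal sketch. It assumes that binding at each address has already been reduced to a single signal intensity per sample; the function name, the threshold, and the signal floor are hypothetical choices made for illustration and are not part of the method as described.

```python
import math


def compare_binding(first_sample, second_sample, min_log2_ratio=1.0, floor=1.0):
    """Return the addresses whose signal differs between two samples.

    Each argument maps an array address, e.g. (row, column), to a signal
    intensity. Only addresses present in both inputs are compared, since
    the capture probes must be present on both arrays.
    """
    shared = sorted(set(first_sample) & set(second_sample))
    changed = {}
    for address in shared:
        # Clamp very low signals so the ratio stays finite (assumed floor).
        first = max(first_sample[address], floor)
        second = max(second_sample[address], floor)
        log2_ratio = math.log2(first / second)
        if abs(log2_ratio) >= min_log2_ratio:
            changed[address] = log2_ratio
    return changed


# Hypothetical intensities for a sample that expresses 26493 and one that does not.
expressing = {(0, 0): 800.0, (0, 1): 100.0, (1, 0): 50.0}
non_expressing = {(0, 0): 100.0, (0, 1): 110.0, (1, 1): 40.0}
print(compare_binding(expressing, non_expressing))
```

Addresses found on only one of the two arrays are ignored here, which mirrors the requirement that the compared addresses be present on both arrays.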

[1211] In another aspect, the invention features a method of analyzing 26493, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 26493 nucleic acid or amino acid sequence; and comparing the 26493 sequence with one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 26493.

[1212] Detection of 26493 Variations or Mutations

[1213] The methods of the invention can also be used to detect genetic alterations in a 26493 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by misregulation in 26493 protein activity or nucleic acid expression. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a 26493-protein, or the mis-expression of the 26493 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of 1) a deletion of one or more nucleotides from a 26493 gene; 2) an addition of one or more nucleotides to a 26493 gene; 3) a substitution of one or more nucleotides of a 26493 gene, 4) a chromosomal rearrangement of a 26493 gene; 5) an alteration in the level of a messenger RNA transcript of a 26493 gene, 6) aberrant modification of a 26493 gene, such as of the methylation pattern of the genomic DNA, 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of a 26493 gene, 8) a non-wild type level of a 26493-protein, 9) allelic loss of a 26493 gene, and 10) inappropriate post-translational modification of a 26493-protein.

[1214] An alteration can be detected with a probe/primer in a polymerase chain reaction, such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 26493 gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a 26493 gene under conditions such that hybridization and amplification of the 26493 gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[1215] In another embodiment, mutations in a 26493 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA are isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment lengths are determined, e.g., by gel electrophoresis, and compared. Differences in fragment lengths between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
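
The fragment length comparison can be sketched in a few lines. The example below is only an illustration: it assumes linear DNA and a single enzyme, uses the EcoRI recognition site (GAATTC, cut after the first G) as an example enzyme, and the short sequences are invented rather than taken from 26493.

```python
def fragment_lengths(sequence, site, cut_offset):
    """Digest a linear sequence at every occurrence of a recognition site.

    cut_offset is the number of bases into the site after which the strand
    is cut. Returns the fragment lengths in ascending order.
    """
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return sorted(end - begin for begin, end in zip(bounds, bounds[1:]) if end > begin)


# Invented control and sample sequences; the sample carries a single base
# substitution that destroys the recognition site.
control = "AAAAGAATTCAAAAAA"
sample = "AAAAGAGTTCAAAAAA"

control_pattern = fragment_lengths(control, "GAATTC", 1)
sample_pattern = fragment_lengths(sample, "GAATTC", 1)
print(control_pattern, sample_pattern, control_pattern != sample_pattern)
```

A difference between the two lists plays the role of a difference in banding pattern on a gel. A substitution outside any recognition site would leave the pattern unchanged and would not be detected this way.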

[1216] In other embodiments, genetic mutations in 26493 can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the other. A different probe is located at each address of the plurality. A probe can be complementary to a region of a 26493 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of a 26493 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 26493 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
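
The notion of a linear array of sequential overlapping probes can be illustrated as follows. This is a sketch under stated assumptions: the probe length, the step between probes, and the reference string are arbitrary, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def tiled_probes(reference, probe_length, step=1):
    """Build overlapping probes complementary to successive reference windows.

    Returns a list of (window_start, probe_sequence) pairs, one per address.
    """
    last_start = len(reference) - probe_length
    return [
        (start, reverse_complement(reference[start:start + probe_length]))
        for start in range(0, last_start + 1, step)
    ]


# Arbitrary six-base reference tiled with four-base probes, one base apart.
for start, probe in tiled_probes("ACGTAC", 4):
    print(start, probe)
```

With a step of one base, every position of the reference is covered by several probes, so a base change in the sample weakens hybridization at a run of neighboring addresses.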

[1217] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 26493 gene and detect mutations by comparing the sequence of the sample 26493 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.
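
The comparison of a sample sequence with the wild-type sequence can be sketched as below. The sketch assumes the two sequences are already aligned and of equal length, so it reports substitutions only; the sequences shown are invented placeholders rather than 26493 sequence.

```python
def point_differences(wild_type, sample):
    """List substitutions between two aligned sequences of equal length.

    Returns (position, wild_type_base, sample_base) tuples, with positions
    counted from 1.
    """
    if len(wild_type) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(wild_type.upper(), sample.upper())
    return [
        (index + 1, wild_base, sample_base)
        for index, (wild_base, sample_base) in enumerate(pairs)
        if wild_base != sample_base
    ]


print(point_differences("ATGGCC", "ATGACC"))
```

Insertions and deletions shift the register of the sample sequence, so detecting them would require an alignment step before this comparison.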

[1218] Other methods for detecting mutations in the 26493 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl. Acad Sci USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295).

[1219] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 26493 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymidine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[1220] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 26493 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc Natl. Acad. Sci USA 86:2766; see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control 26493 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence, and the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).

[1221] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to insure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[1222] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad. Sci USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[1223] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′end of one primer where, under appropriate conditions, mismatch can prevent, or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase for amplification (Barany (1991) Proc. Natl. Acad. Sci USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′end of the 5′sequence making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.

[1224] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 50%, 60%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to a 26493 nucleic acid.

[1225] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO: 16 or the complement of SEQ ID NO: 16. Different locations can be different but overlapping, or non-overlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[1226] The set can be useful, e.g., for identifying SNPs, or identifying specific alleles of 26493. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a biallelic or polymorphic locus.

[1227] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the Tm of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
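
A set of four oligonucleotides differing only at the interrogation position can be generated as in the following sketch. The function name, the zero-based position, and the example sequence are hypothetical and serve only to illustrate the idea.

```python
def interrogation_set(oligonucleotide, position):
    """Return four oligonucleotides that differ only at one position.

    position is a zero-based index into the sequence. The result maps each
    base placed at that position to the corresponding full sequence.
    """
    oligonucleotide = oligonucleotide.upper()
    if not 0 <= position < len(oligonucleotide):
        raise ValueError("position lies outside the oligonucleotide")
    return {
        base: oligonucleotide[:position] + base + oligonucleotide[position + 1:]
        for base in "ACGT"
    }


# Placeholder sequence; index 3 is the interrogation position.
for base, sequence in interrogation_set("GATTACA", 3).items():
    print(base, sequence)
```

Each member of the set could then be assigned its own label or array address, so that the strongest signal indicates which nucleotide is present at the interrogation position in the sample.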

[1228] In a preferred embodiment, the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, a 26493 nucleic acid.

[1229] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a 26493 gene.

[1230] Use of 26493 Molecules as Surrogate Markers

[1231] The 26493 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 26493 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 26493 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker which correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[1232] The 26493 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker which correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., a 26493 marker) transcription or expression, the amplified marker may be in a quantity which is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-26493 antibodies may be employed in an immune-based detection system for a 26493 protein marker, or 26493-specific radiolabeled probes may be used to detect a 26493 mRNA marker. Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. 
Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[1233] The 26493 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker which correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35:1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA or protein (e.g., 26493 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 26493 DNA may correlate with 26493 drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[1234] Pharmaceutical Compositions of 26493

[1235] The nucleic acid and polypeptides, fragments thereof, as well as anti-26493 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[1236] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as Water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[1237] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol or sorbitol, or sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

[1238] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[1239] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[1240] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from pressured container or dispenser which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[1241] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[1242] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[1243] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[1244] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[1245] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds which exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
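
The therapeutic index defined above is a simple ratio, which can be written out as a worked example. The numerical values below are invented for illustration and do not describe any compound of the invention.

```python
def therapeutic_index(ld50, ed50):
    """Return the therapeutic index, i.e. the ratio LD50/ED50.

    Both doses must be positive and expressed in the same units
    (for example, mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Invented values: LD50 of 200 mg/kg and ED50 of 10 mg/kg.
print(therapeutic_index(200.0, 10.0))
```

A larger ratio means a wider margin between the effective dose and the lethal dose, which is why compounds with high therapeutic indices are preferred.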

[1246] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.

[1247] As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.

[1248] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J. Acquired Immune Deficiency Syndromes and Human Retrovirology 14:193).

[1249] The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[1250] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

[1251] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive metal ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, and puromycin and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine and vinblastine).

[1252] The conjugates of the invention can be used for modifying a given biological response; the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[1253] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[1254] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[1255] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[1256] Methods of Treatment for 26493

[1257] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 26493 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[1258] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. “Pharmacogenomics”, as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype”, or “drug response genotype”). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 26493 molecules of the present invention or 26493 modulators according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

[1259] In one aspect, the invention provides a method for preventing, in a subject, a disease or condition associated with an aberrant or unwanted 26493 expression or activity, by administering to the subject a 26493 or an agent which modulates 26493 expression or at least one 26493 activity. Subjects at risk for a disease which is caused or contributed to by aberrant or unwanted 26493 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 26493 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 26493 aberrance, for example, a 26493, 26493 agonist or 26493 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[1260] It is possible that some 26493 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[1261] The 26493 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of cellular proliferative and/or differentiative disorders, immune disorders, cardiovascular disorders and brain disorders as described above, as well as disorders associated with bone metabolism, liver disorders, viral diseases, pain or metabolic disorders.

[1262] Aberrant expression and/or activity of 26493 molecules may mediate disorders associated with bone metabolism. “Bone metabolism” refers to direct or indirect effects in the formation or degeneration of bone structures, e.g., bone formation, bone resorption, etc., which may ultimately affect the concentrations in serum of calcium and phosphate. This term also includes activities mediated by the effects of 26493 molecules in bone cells, e.g., osteoclasts and osteoblasts, that may in turn result in bone formation and degeneration. For example, 26493 molecules may support different activities of bone resorbing osteoclasts such as the stimulation of differentiation of monocytes and mononuclear phagocytes into osteoclasts. Accordingly, 26493 molecules that modulate the production of bone cells can influence bone formation and degeneration, and thus may be used to treat bone disorders. Examples of such disorders include, but are not limited to, osteoporosis, osteodystrophy, osteomalacia, rickets, osteitis fibrosa cystica, renal osteodystrophy, osteosclerosis, anti-convulsant treatment, osteopenia, fibrogenesis-imperfecta ossium, secondary hyperparathyroidism, hypoparathyroidism, hyperparathyroidism, cirrhosis, obstructive jaundice, drug induced metabolism, medullary carcinoma, chronic renal disease, sarcoidosis, glucocorticoid antagonism, malabsorption syndrome, steatorrhea, tropical sprue, idiopathic hypercalcemia and milk fever.

[1263] Disorders which may be treated or diagnosed by methods described herein include, but are not limited to, disorders associated with an accumulation in the liver of fibrous tissue, such as that resulting from an imbalance between production and degradation of the extracellular matrix accompanied by the collapse and condensation of preexisting fibers. The methods described herein can be used to diagnose or treat hepatocellular necrosis or injury induced by a wide variety of agents including processes which disturb homeostasis, such as an inflammatory process, tissue damage resulting from toxic injury or altered hepatic blood flow, and infections (e.g., bacterial, viral and parasitic). For example, the methods can be used for the early detection of hepatic injury, such as portal hypertension or hepatic fibrosis. In addition, the methods can be employed to detect liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, A1-antitrypsin deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease), disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome). 
Additionally, the methods described herein may be useful for the early detection and treatment of liver injury associated with the administration of various chemicals or drugs, such as, for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol, or which represents a hepatic manifestation of a vascular disorder such as obstruction of either the intrahepatic or extrahepatic bile flow or an alteration in hepatic circulation resulting, for example, from chronic heart failure, veno-occlusive disease, portal vein thrombosis or Budd-Chiari syndrome.

[1264] Additionally, 26493 molecules may play an important role in the etiology of certain viral diseases, including but not limited to Hepatitis B, Hepatitis C and Herpes Simplex Virus (HSV). Modulators of 26493 activity could be used to control viral diseases. The modulators can be used in the treatment and/or diagnosis of viral infected tissue or virus-associated tissue fibrosis, especially liver and liver fibrosis. Also, 26493 modulators can be used in the treatment and/or diagnosis of virus-associated carcinoma, especially hepatocellular cancer.

[1265] Additionally, 26493 may play an important role in the regulation of metabolism or pain disorders. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes. Examples of pain disorders include, but are not limited to, pain response elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia, usually referred to as hyperalgesia (described in, for example, Fields, H. L. (1987) Pain, New York: McGraw-Hill); pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches; pain associated with surgery; pain related to irritable bowel syndrome; or chest pain.

[1266] As discussed, successful treatment of 26493 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using an assay described above, that prove to exhibit negative modulatory activity, can be used in accordance with the invention to prevent and/or ameliorate symptoms of 26493 disorders. Such molecules can include, but are not limited to, peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)₂ and Fab expression library fragments, scFV molecules, and epitope-binding fragments thereof).

[1267] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[1268] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[1269] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 26493 expression is through the use of aptamer molecules specific for 26493 protein. Aptamers are nucleic acid molecules having a tertiary structure which permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. (1997) Curr. Opin. Chem Biol. 1: 5-9; and Patel, D. J. (1997) Curr Opin Chem Biol 1:32-46). Since nucleic acid molecules may in many cases be more conveniently introduced into target cells than therapeutic protein molecules may be, aptamers offer a method by which 26493 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[1270] Antibodies can be generated that are both specific for target gene product and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 26493 disorders. For a description of antibodies, see the Antibody section above.

[1271] In circumstances wherein injection of an animal or a human subject with a 26493 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 26493 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. (1999) Ann Med 31:66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. (1998) Cancer Treat Res. 94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 26493 protein. Vaccines directed to a disease characterized by 26493 expression may also be generated in this fashion.

[1272] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see e.g., Marasco et al. (1993) Proc. Natl. Acad. Sci. USA 90:7889-7893).

[1273] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 26493 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures as described above.

[1274] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED₅₀ with little or no toxicity. The dosage can vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose can be formulated in animal models to achieve a circulating plasma concentration range that includes the IC₅₀ (i.e., the concentration of the test compound that achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma can be measured, for example, by high performance liquid chromatography. Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. The compound which is able to modulate 26493 activity is used as a template, or “imprinting molecule”, to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix which contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al (1996) Current Opinion in Biotechnology 7:89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2:166-173. 
Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al (1993) Nature 361:645-647. Through the use of isotope-labeling, the “free” concentration of compound which modulates the expression or activity of 26493 can be readily monitored and used in calculations of IC₅₀.
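The IC₅₀ calculation mentioned above can be illustrated with a short sketch. It assumes the response has already been normalized to percent of maximal inhibition and that inhibition rises with concentration; the function name `estimate_ic50` and the data points are hypothetical, and log-linear interpolation is only one of several ways such a value might be estimated.

```python
import math


def estimate_ic50(concentrations, percent_inhibition):
    """Estimate IC50 by linear interpolation on log10(concentration).

    Assumes concentrations are positive and ascending, and that
    percent_inhibition (0 to 100, relative to the maximal effect) rises
    with concentration. Returns None if 50% is never bracketed.
    """
    points = list(zip(concentrations, percent_inhibition))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if y_lo <= 50.0 <= y_hi and y_hi > y_lo:
            frac = (50.0 - y_lo) / (y_hi - y_lo)
            log_lo, log_hi = math.log10(c_lo), math.log10(c_hi)
            return 10 ** (log_lo + frac * (log_hi - log_lo))
    return None


# Hypothetical dose-response data (arbitrary concentration units).
ic50 = estimate_ic50([0.1, 1.0, 10.0, 100.0], [10.0, 30.0, 70.0, 90.0])
print(f"estimated IC50: {ic50:.2f}")
```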

[1275] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC₅₀. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al (1995) Analytical Chemistry 67:2142-2144.

[1276] Another aspect of the invention pertains to methods of modulating 26493 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with a 26493 molecule or an agent that modulates one or more of the activities of 26493 protein associated with the cell. An agent that modulates 26493 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of a 26493 protein (e.g., a 26493 substrate or receptor), a 26493 antibody, a 26493 agonist or antagonist, a peptidomimetic of a 26493 agonist or antagonist, or other small molecule.

[1277] In one embodiment, the agent stimulates one or more 26493 activities. Examples of such stimulatory agents include active 26493 protein and a nucleic acid molecule encoding 26493. In another embodiment, the agent inhibits one or more 26493 activities. Examples of such inhibitory agents include antisense 26493 nucleic acid molecules, anti-26493 antibodies, and 26493 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of a 26493 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., up regulates or down regulates) 26493 expression or activity. In another embodiment, the method involves administering a 26493 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 26493 expression or activity.

[1278] Stimulation of 26493 activity is desirable in situations in which 26493 is abnormally downregulated and/or in which increased 26493 activity is likely to have a beneficial effect. Likewise, inhibition of 26493 activity is desirable in situations in which 26493 is abnormally upregulated and/or in which decreased 26493 activity is likely to have a beneficial effect.

[1279] 26493 Pharmacogenomics

[1280] The 26493 molecules of the present invention, as well as agents or modulators which have a stimulatory or inhibitory effect on 26493 activity (e.g., 26493 gene expression) as identified by a screening assay described herein, can be administered to individuals to treat (prophylactically or therapeutically) 26493-associated disorders, i.e., disorders associated with aberrant or unwanted 26493 activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a 26493 molecule or 26493 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a 26493 molecule or 26493 modulator.

[1281] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23:983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43:254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as single factors altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase deficiency (G6PD) is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[1282] One pharmacogenomics approach to identifying genes that predict drug response, known as “a genome-wide association”, relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high resolution map can be generated from a combination of some ten-million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
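The grouping of individuals by SNP pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions: the marker identifiers, individual labels, and genotype calls are all hypothetical, and a real analysis would involve far more markers and statistical testing.

```python
from collections import defaultdict


def group_by_snp_pattern(genotypes, marker_ids):
    """Group individuals whose genotype calls match at every listed marker.

    genotypes maps an individual label to a dict of marker id -> call.
    Returns a dict mapping each observed pattern (a tuple of calls, in
    marker_ids order) to the list of individuals carrying it.
    """
    groups = defaultdict(list)
    for individual, calls in genotypes.items():
        pattern = tuple(calls[marker] for marker in marker_ids)
        groups[pattern].append(individual)
    return dict(groups)


# Hypothetical bi-allelic calls at two hypothetical markers.
calls = {
    "subject_1": {"snp_a": "A/A", "snp_b": "C/T"},
    "subject_2": {"snp_a": "A/G", "snp_b": "C/T"},
    "subject_3": {"snp_a": "A/A", "snp_b": "C/T"},
}
for pattern, members in group_by_snp_pattern(calls, ["snp_a", "snp_b"]).items():
    print(pattern, members)
```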

[1283] Alternatively, a method termed the “candidate gene approach,” can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., a 26493 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.
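As a rough sketch of the comparison the candidate gene approach implies, the following tallies the fraction of responders among carriers of each gene variant. The variant labels and response records are hypothetical, the function name `response_rate_by_variant` is an assumption, and a simple rate comparison stands in here for whatever statistical association test would actually be applied.

```python
from collections import Counter


def response_rate_by_variant(records):
    """Return the fraction of responders for each variant.

    records is an iterable of (variant_label, responded) pairs, where
    responded is True if the individual showed the drug response.
    """
    totals = Counter()
    responders = Counter()
    for variant, responded in records:
        totals[variant] += 1
        if responded:
            responders[variant] += 1
    return {variant: responders[variant] / totals[variant] for variant in totals}


# Hypothetical observations for two hypothetical variants of one gene.
observations = [
    ("variant_1", True), ("variant_1", True), ("variant_1", False), ("variant_1", True),
    ("variant_2", False), ("variant_2", False), ("variant_2", True), ("variant_2", False),
]
print(response_rate_by_variant(observations))
```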

[1284] Alternatively, a method termed “gene expression profiling” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., a 26493 molecule or 26493 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[1285] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 26493 molecule or 26493 modulator, such as a modulator identified by one of the exemplary screening assays described herein.

[1286] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 26493 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 26493 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells, e.g., human cells, will become sensitive to treatment with an agent that the unmodified target cells were resistant to.

[1287] Monitoring the influence of agents (e.g., drugs) on the expression or activity of a 26493 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 26493 gene expression, protein levels, or upregulate 26493 activity, can be monitored in clinical trials of subjects exhibiting decreased 26493 gene expression, protein levels, or downregulated 26493 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 26493 gene expression, protein levels, or downregulate 26493 activity, can be monitored in clinical trials of subjects exhibiting increased 26493 gene expression, protein levels, or upregulated 26493 activity. In such clinical trials, the expression or activity of a 26493 gene, and preferably, other genes that have been implicated in, for example, a 26493-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[1288] 26493 Informatics

[1289] The sequence of a 26493 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a 26493. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 26493 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequences, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[1290] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[1291] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on a computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or presented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain a computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[1292] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples for annotations to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites, and splice junctions. Non-limiting examples for annotations to amino acid sequences include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
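The two-table layout described above can be sketched as follows. This is a minimal illustration only, using Python's built-in sqlite3 module rather than Sybase or Oracle; the table names, column names, identifier, and placeholder sequence are all assumptions introduced for the example and do not appear in the specification.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sequences (
    seq_id   TEXT PRIMARY KEY,   -- identifier for the sequence
    residues TEXT NOT NULL       -- nucleic acid or amino acid sequence
);
CREATE TABLE annotations (
    seq_id     TEXT NOT NULL REFERENCES sequences(seq_id),
    descriptor TEXT NOT NULL,    -- descriptor or annotation text
    start_pos  INTEGER NOT NULL, -- initial position (1-based, inclusive)
    end_pos    INTEGER NOT NULL  -- ultimate position (1-based, inclusive)
);
""")

# Placeholder data, not a sequence from this document.
conn.execute(
    "INSERT INTO sequences VALUES (?, ?)",
    ("demo_protein", "MKTAYIAKQRNGSLLE"),
)
conn.execute(
    "INSERT INTO annotations VALUES (?, ?, ?, ?)",
    ("demo_protein", "example modification site", 11, 14),
)


def annotated_regions(connection, seq_id):
    """Return (descriptor, subsequence) pairs for one sequence identifier."""
    rows = connection.execute(
        "SELECT a.descriptor, "
        "       substr(s.residues, a.start_pos, a.end_pos - a.start_pos + 1) "
        "FROM annotations a JOIN sequences s ON s.seq_id = a.seq_id "
        "WHERE a.seq_id = ? ORDER BY a.start_pos",
        (seq_id,),
    )
    return rows.fetchall()


regions = annotated_regions(conn, "demo_protein")
```

Keeping the annotations in a separate table keyed by the sequence identifier lets one sequence carry any number of annotations without duplicating the sequence itself.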

[1293] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.
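A search of stored sequences for a target sequence can be sketched as below. This is a simplified exact-match scan, not a BLAST search; the function name `find_target` and the two stored sequences are hypothetical and serve only to illustrate the idea of locating matching regions.

```python
def find_target(stored, target):
    """Return {identifier: [1-based start positions]} for exact matches.

    Overlapping matches are reported, because the scan resumes one
    position after each hit rather than after the end of the hit.
    """
    hits = {}
    for seq_id, seq in stored.items():
        positions = []
        start = seq.find(target)
        while start != -1:
            positions.append(start + 1)  # convert to 1-based coordinates
            start = seq.find(target, start + 1)
        if positions:
            hits[seq_id] = positions
    return hits


# Placeholder sequences, not sequences of the invention.
stored = {"demo_a": "ATGGCGATGGCC", "demo_b": "TTTTCCCC"}
hits = find_target(stored, "ATGGC")
```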

[1294] Thus, in one aspect, the invention features a method of analyzing 26493, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing a 26493 nucleic acid or amino acid sequence; comparing the 26493 sequence with a second sequence, e.g., one or, more preferably, a plurality of sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 26493. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[1295] The method can include evaluating the sequence identity between a 26493 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.
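One simple way to evaluate sequence identity between two sequences that have already been aligned is sketched below. The function name `percent_identity` is hypothetical, and the convention used (identical columns divided by all columns in which at least one sequence has a residue) is only one of several in common use; it is an assumption of this sketch, not a definition taken from the specification.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned, equal-length sequences.

    Columns where both sequences have a gap are ignored; a column with
    a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Placeholder alignment: 4 identical columns out of 6.
identity = percent_identity("ACGT-A", "ACCTTA")
```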

[1296] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
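The relationship between target length and chance occurrence can be illustrated numerically. The sketch below assumes, purely for illustration, that database residues are independent and equally likely (four nucleotides or twenty amino acids); real databases are not uniform, so the values are rough order-of-magnitude estimates. The function name `expected_random_hits` is hypothetical.

```python
def expected_random_hits(target_length, alphabet_size, database_residues):
    """Expected number of chance exact matches of one target in a database.

    Under a uniform, independent-residue model, a given start position
    matches with probability alphabet_size ** -target_length, and there
    are (database_residues - target_length + 1) possible start positions.
    """
    positions = max(database_residues - target_length + 1, 0)
    return positions * alphabet_size ** (-target_length)


# A 6-nucleotide target versus a 30-nucleotide target in 10**9 residues.
short_target = expected_random_hits(6, 4, 10**9)
long_target = expected_random_hits(30, 4, 10**9)
```

Under this model the expected count falls by a factor of four for each additional nucleotide, which is why longer targets are far less likely to be present by chance.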

[1297] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[1298] Thus, the invention features a method of making a computer readable record of a 26493 sequence, which includes recording the sequence on a computer readable matrix. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[1299] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing a 26493 sequence, or record, in machine-readable form; comparing a second sequence to the 26493 sequence; thereby analyzing a sequence. Comparison can include comparing to sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 26493 sequence includes a sequence being compared. In a preferred embodiment the 26493 or second sequence is stored on a first computer, e.g., at a first site and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 26493 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[1300] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has a 26493-associated disease or disorder or a pre-disposition to a 26493-associated disease or disorder, wherein the method comprises the steps of determining 26493 sequence information associated with the subject and based on the 26493 sequence information, determining whether the subject has a 26493-associated disease or disorder or a pre-disposition to a 26493-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[1301] The invention further provides, in an electronic system and/or in a network, a method for determining whether a subject has a 26493-associated disease or disorder or a pre-disposition to a disease associated with a 26493 wherein the method comprises the steps of determining 26493 sequence information associated with the subject, and based on the 26493 sequence information, determining whether the subject has a 26493-associated disease or disorder or a pre-disposition to a 26493-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, comparing the 26493 sequence of the subject to the 26493 sequences in the database to thereby determine whether the subject has a 26493-associated disease or disorder, or a pre-disposition for such.

[1302] The present invention also provides, in a network, a method for determining whether a subject has a 26493-associated disease or disorder or a pre-disposition to a 26493-associated disease or disorder, said method comprising the steps of receiving 26493 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 26493 and/or corresponding to a 26493-associated disease or disorder, and based on one or more of the phenotypic information, the 26493 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has a 26493-associated disease or disorder or a pre-disposition to a 26493-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[1303] The present invention also provides a method for determining whether a subject has a 26493-associated disease or disorder or a pre-disposition to a 26493-associated disease or disorder, said method comprising the steps of receiving information related to 26493 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 26493 and/or related to a 26493-associated disease or disorder, and based on one or more of the phenotypic information, the 26493 information, and the acquired information, determining whether the subject has a 26493-associated disease or disorder or a pre-disposition to a 26493-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[1304] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

Background of the 58224 Invention

[1305] Helicases are mechanochemical enzymes that couple the energy of nucleoside triphosphate hydrolysis to the dehybridization or unwinding of duplex nucleic acid molecules. Nucleic acid unwinding is of central importance in a variety of nucleic acid processes that include the transcription, translation, recombination, and replication of genetic material. The importance of helicases is further underscored by the large number of DNA or RNA helicases identified in prokaryotic and eukaryotic organisms.

[1306] Helicase domains, for example, “DEAD/H” helicase domains, are found in a number of proteins, including initiation factor eIF-4A, splicing proteins PRP5 and PRP28, and SNF2-domain containing proteins. SNF2-domain containing proteins have been reported to be involved in a variety of processes, including transcription regulation (e.g., SNF2, STH1, brahma, MOT1), maintenance of chromosome stability during mitosis (e.g., lodestar), and various aspects of processing DNA damage, including DNA excision repair (e.g., RAD16 and ERCC6), recombinational pathways (e.g., RAD54) and post-replication daughter strand gap repair (e.g., RAD5) (Eisen, J. A. et al. (1995) Nucleic Acids Res. 23(14):2715-23).

Summary of the 58224 Invention

[1307] The present invention is based, in part, on the discovery of a novel helicase family member, referred to herein as “58224”. The nucleotide sequence of a cDNA encoding 58224 is shown in SEQ ID NO:22, and the amino acid sequence of a 58224 polypeptide is shown in SEQ ID NO:23. In addition, the nucleotide sequences of the coding region are depicted in SEQ ID NO:24 (See Example 15).

[1308] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes a 58224 protein or polypeptide, e.g., a biologically active portion of the 58224 protein. In a preferred embodiment the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO:23. In other embodiments, the invention provides isolated 58224 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO:22 or SEQ ID NO:24 or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO:22, SEQ ID NO:24, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:22, SEQ ID NO:24, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 58224 protein or an active fragment thereof.

[1309] In a related aspect, the invention further provides nucleic acid constructs that include a 58224 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included, are vectors and host cells containing the 58224 nucleic acid molecules of the invention e.g., vectors and host cells suitable for producing 58224 nucleic acid molecules and polypeptides.

[1310] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 58224-encoding nucleic acids.

[1311] In still another related aspect, isolated nucleic acid molecules that are antisense to a 58224 encoding nucleic acid molecule are provided.

[1312] In another aspect, the invention features 58224 polypeptides, and biologically active or antigenic fragments thereof, that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 58224-mediated or -related disorders. In another embodiment, the invention provides 58224 polypeptides having a 58224 activity. Preferred polypeptides are 58224 proteins including at least one helicase domain, e.g., a C-terminal helicase domain, or an SNF2 domain, e.g., an N-terminal SNF2 domain, and, preferably, having a 58224 activity, e.g., a 58224 activity as described herein.

[1313] In other embodiments, the invention provides 58224 polypeptides, e.g., a 58224 polypeptide having the amino acid sequence shown in SEQ ID NO:23, or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO:23, or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:22, SEQ ID NO:24, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 58224 protein or an active fragment thereof.

[1314] In a related aspect, the invention further provides nucleic acid constructs which include a 58224 nucleic acid molecule described herein.

[1315] In a related aspect, the invention provides 58224 polypeptides or fragments operatively linked to non-58224 polypeptides to form fusion proteins.

[1316] In another aspect, the invention features antibodies and antigen-binding fragments thereof, that react with, or more preferably specifically bind to, 58224 polypeptides. In other embodiments, the antibody or antigen-binding fragment thereof reacts with, or more preferably binds specifically to a 58224 polypeptide or a fragment thereof, e.g., a helicase domain, or an SNF2 domain of a 58224 polypeptide. In one embodiment, the antibody or antigen-binding fragment thereof competitively inhibits the binding of a second antibody to its target epitope.

[1317] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 58224 polypeptides or nucleic acids.

[1318] In still another aspect, the invention provides a process for modulating 58224 polypeptide or nucleic acid expression or activity, e.g. using the screened compounds. In certain embodiments, the methods involve treatment of conditions, e.g., disorders or diseases, related to aberrant activity or expression of the 58224 polypeptides or nucleic acids, such as conditions involving aberrant or deficient cellular proliferation or differentiation, or tumor invasion or metastasis.

[1319] In yet another aspect, the invention provides methods for inhibiting the proliferation or inducing the killing or differentiation of a 58224-expressing cell (e.g., a 58224-expressing hyperproliferative cell), comprising contacting the cell with an agent (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 58224 polypeptide or nucleic acid, thereby inhibiting the proliferation or inducing the killing or differentiation of the 58224-expressing cell.

[1320] In a preferred embodiment, the contacting step is effected in vitro or ex vivo. In other embodiments, the contacting step is effected in vivo, e.g., in a subject (e.g., a mammal, e.g., a human), as part of a therapeutic or prophylactic protocol.

[1321] In a preferred embodiment, the cell, e.g., the hyperproliferative cell is found in a solid tumor, a soft tissue tumor, or a metastatic lesion. Preferably, the tumor is a sarcoma, a carcinoma, or an adenocarcinoma. Preferably, the cell, e.g., the hyperproliferative cell is found in a cancerous or pre-cancerous tissue, e.g., a cancerous or pre-cancerous tissue where a 58224 polypeptide or nucleic acid is expressed, e.g., breast, ovarian, lung, colon, or brain cancer, or a liver metastasis. Most preferably, the cell, e.g., the hyperproliferative cell is found in a tumor from the breast, ovary, colon, liver or lung.

[1322] In a preferred embodiment, the agent, e.g., the compound, is an inhibitor of a 58224 polypeptide. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion). The inhibitor can also be a trypsin inhibitor or a derivative thereof, or a peptidomimetic, e.g., a phosphonate analog of a peptide substrate.

[1323] In a preferred embodiment, the agent, e.g., the compound, is an inhibitor of a 58224 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[1324] In a preferred embodiment, the agent, e.g., the compound, is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[1325] In another aspect, the invention features methods for treating or preventing a disorder characterized by aberrant activity, e.g., aberrant cellular proliferation or differentiation, of a 58224-expressing cell, in a subject. Preferably, the method includes administering to the subject (e.g., a mammal, e.g., a human) an effective amount of an agent, e.g., a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 58224 polypeptide or nucleic acid.

[1326] In a preferred embodiment, the disorder is a cancerous or pre-cancerous condition. Most preferably, the disorder is a cancer, e.g., a solid tumor, a soft tissue tumor, or a metastatic lesion. Preferably, the cancer is a sarcoma, a carcinoma, or an adenocarcinoma. Preferably, the cancer is found in a tissue where a 58224 polypeptide or nucleic acid is expressed, e.g., breast, ovarian, lung, colon, or brain cancer, or a liver metastasis. Most preferably, the cancer is found in the breast, ovary, colon, liver, or lung.

[1327] In a preferred embodiment, the agent, e.g., the compound, is an inhibitor of a 58224 polypeptide. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion). The inhibitor can also be a trypsin inhibitor or a derivative thereof, or a peptidomimetic, e.g., a phosphonate analog of a peptide substrate.

[1328] In a preferred embodiment, the agent, e.g., the compound, is an inhibitor of a 58224 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[1329] In a preferred embodiment, the agent, e.g., the compound, is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[1330] The invention also provides assays for determining the activity of or the presence or absence of 58224 polypeptides or nucleic acid molecules in a biological sample, including for disease diagnosis. Preferably, the biological sample includes a cancerous or pre-cancerous cell or tissue. For example, the cancerous tissue can be a solid tumor, a soft tissue tumor, or a metastatic lesion. Preferably, the cancerous tissue is a sarcoma, a carcinoma, or an adenocarcinoma. Preferably, the cancerous tissue is from the breast, ovary, colon, lung, liver, or brain.

[1331] In a further aspect the invention provides assays for determining the presence or absence of a genetic alteration in a 58224 polypeptide or nucleic acid molecule in a sample, for, e.g., disease diagnosis. Preferably, the sample includes a cancer cell or tissue. For example, the cancer can be a solid tumor, a soft tissue tumor, or a metastatic lesion. Preferably, the cancer is a sarcoma, a carcinoma, or an adenocarcinoma. Preferably, the cancer is a breast, ovarian, colon, lung, liver, or brain cancer.

[1332] In a still further aspect, the invention provides methods for staging a disorder, or evaluating the efficacy of a treatment of a disorder, e.g., a proliferative disorder, e.g., a cancer (e.g., a breast, ovarian, colon, liver or lung cancer). The method includes: treating a subject, e.g., a patient or an animal, with a protocol under evaluation (e.g., treating a subject with one or more of: chemotherapy, radiation, and/or a compound identified using the methods described herein); and evaluating the expression of a 58224 nucleic acid or polypeptide before and after treatment. A change, e.g., a decrease or increase, in the level of a 58224 nucleic acid (e.g., mRNA) or polypeptide after treatment, relative to the level of expression before treatment, is indicative of the efficacy of the treatment of the disorder.

[1333] In a preferred embodiment, the disorder is a cancer of the breast, ovary, colon, lung, or liver. The level of 58224 nucleic acid or polypeptide expression can be detected by any method described herein.

[1334] In a preferred embodiment, the evaluating step includes obtaining a sample (e.g., a tissue sample, e.g., a biopsy, or a fluid sample) from the subject before and after treatment, and comparing the level of expression of a 58224 nucleic acid (e.g., mRNA) or polypeptide before and after treatment.

[1335] In another aspect, the invention provides methods for evaluating the efficacy of a therapeutic or prophylactic agent (e.g., an anti-neoplastic agent). The method includes: contacting a sample with an agent (e.g., a compound identified using the methods described herein, a cytotoxic agent) and, evaluating the expression of 58224 nucleic acid or polypeptide in the sample before and after the contacting step. A change, e.g., a decrease or increase, in the level of 58224 nucleic acid (e.g., mRNA) or polypeptide in the sample obtained after the contacting step, relative to the level of expression in the sample before the contacting step, is indicative of the efficacy of the agent. The level of 58224 nucleic acid or polypeptide expression can be detected by any method described herein.

[1336] In a preferred embodiment, the sample includes cells obtained from a cancerous tissue where a 58224 polypeptide or nucleic acid is expressed, e.g., a cancer of the breast, ovary, colon, lung, or liver.

[1337] In a preferred embodiment, the sample is a tissue sample (e.g., a biopsy), a bodily fluid, or cultured cells (e.g., a tumor cell line).

[1338] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 58224 polypeptide or nucleic acid molecule, including for disease diagnosis.

[1339] In another aspect, the invention features a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes a 58224 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to a 58224 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 58224 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.

[1340] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

Detailed Description of 58224

[1341] The human 58224 sequence (SEQ ID NO:22), which is approximately 2798 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 2514 nucleotides (nucleotides 106-2621 of SEQ ID NO:22; SEQ ID NO:24). The coding sequence encodes an 838 amino acid protein (SEQ ID NO:23).

[1342] Human 58224 contains the following regions or other structural features: nine N-glycosylation sites (PFAM Accession PS00001) located at about amino acid residues 137 to 140, 162 to 165, 389 to 392, 485 to 488, 517 to 520, 641 to 644, 682 to 685, 760 to 763, and 811 to 814 of SEQ ID NO:23; three predicted cAMP- and cGMP-dependent protein kinase phosphorylation sites (PS00004) at about amino acids 103 to 106, 291 to 294, and 509 to 512 of SEQ ID NO:23; ten protein kinase C phosphorylation sites (PS00005) at about amino acids 106 to 108, 190 to 192, 473 to 475, 495 to 497, 505 to 507, 512 to 514, 598 to 600, 652 to 654, 673 to 675, and 793 to 795 of SEQ ID NO:23; 16 predicted casein kinase II sites (PS00006) located at about amino acids 84 to 87, 143 to 146, 305 to 308, 331 to 334, 413 to 416, 419 to 422, 423 to 426, 495 to 498, 519 to 522, 626 to 629, 643 to 646, 650 to 653, 663 to 666, 684 to 687, 793 to 796, and 832 to 835 of SEQ ID NO:23; three tyrosine kinase phosphorylation sites (PS00007) at about amino acids 58 to 65, 128 to 136, and 474 to 480 of SEQ ID NO:23; four predicted N-myristoylation sites (PS00008) at about amino acids 8 to 13, 251 to 256, 678 to 683, and 754 to 759 of SEQ ID NO:23; and one predicted leucine zipper pattern (PS00029) at about amino acids 379 to 400 of SEQ ID NO:23.
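Short sequence patterns of the kind listed above can be located with a simple pattern scan. The sketch below is illustrative only: the regular expressions are an assumed translation of the commonly published forms of four of the PS-prefixed patterns, the function name `scan` is hypothetical, and the toy sequence is a placeholder, not SEQ ID NO:23.

```python
import re

# Assumed regular-expression forms of four PS-prefixed patterns.
PATTERNS = {
    "PS00001": r"N[^P][ST][^P]",     # N-glycosylation site
    "PS00005": r"[ST].[RK]",         # protein kinase C phosphorylation site
    "PS00006": r"[ST].{2}[DE]",      # casein kinase II phosphorylation site
    "PS00029": r"L.{6}L.{6}L.{6}L",  # leucine zipper pattern
}


def scan(sequence, patterns=PATTERNS):
    """Return (pattern_id, start, end) tuples with 1-based inclusive coordinates.

    A lookahead is used so that overlapping sites are all reported.
    """
    hits = []
    for pattern_id, regex in patterns.items():
        for match in re.finditer(f"(?=({regex}))", sequence):
            start = match.start() + 1
            end = match.start() + len(match.group(1))
            hits.append((pattern_id, start, end))
    return hits


# Placeholder sequence chosen so that several patterns occur.
toy_hits = scan("MANGSAKTPRSLEED")
```

Patterns this short match frequently by chance, which is consistent with the sites above being described as predicted rather than confirmed.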

[1343] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[1344] A plasmid containing the nucleotide sequence encoding human 58224 (clone Fbh58224FL) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

[1345] The 58224 protein contains a significant number of structural characteristics in common with members of the helicase family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics.

[1346] The helicase family of proteins comprises a number of related enzymes that share structural homology and a common catalytic mechanism whereby the enzyme converts the energy from NTP, e.g., ATP, hydrolysis into the mechanical energy required for unwinding of nucleic acid duplexes. For example, DEAD-type helicases catalyze the unwinding of ribonucleic acids during RNA splicing. Other RNA helicases include members of the DEAH or DEXH helicase subfamilies. The SNF2 subfamily of helicases or helicase-like proteins comprises proteins that can act as transcriptional regulators for multiple genes. Thus, this family includes enzymes critical for the proper function of many physiological systems, including replication, transcription, and cellular proliferation and differentiation.

[1347] A 58224 polypeptide can include an “SNF2 N-terminal domain” or regions homologous with an “SNF2 N-terminal domain.” As used herein, the term “SNF2 N-terminal domain” refers to a protein domain having an amino acid sequence of about 200 to 500 amino acids and having a bit score for the alignment of the sequence to the SNF2 N-terminal domain (HMM) of at least 150. Preferably, an SNF2 N-terminal domain includes at least about 250 to 450 amino acids, preferably about 300 to 400 amino acid residues, or more preferably about 350 amino acid residues and has a bit score for the alignment of the sequence to the SNF2 N-terminal domain (HMM) of at least about 160, 170, 180, 190, 200, 225, 250, 275, 300, 350 or greater. An alignment of the SNF2 N-terminal domain (amino acids 226 to 577 of SEQ ID NO:23) of human 58224 with a consensus amino acid sequence derived from a hidden Markov model is depicted in FIG. 14A.

[1348] In a preferred embodiment, a 58224 polypeptide or protein has an “SNF2 N-terminal domain” or a region which includes at least about 200 to 500 amino acids, preferably about 300 to 400 amino acid residues, or more preferably about 350 amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with an “SNF2 N-terminal domain,” e.g., the SNF2 N-terminal domain of human 58224 (e.g., residues 226-577 of SEQ ID NO:23).

[1349] A 58224 polypeptide can also include a “helicase conserved C-terminal domain” or regions homologous with a “helicase conserved C-terminal domain”. As used herein, the term “helicase conserved C-terminal domain” refers to a protein domain having an amino acid sequence of about 50-200 amino acids and having a bit score for the alignment of the sequence to the helicase conserved C-terminal domain (HMM) of at least 50. Preferably, a helicase conserved C-terminal domain includes at least about 60-150 amino acids, preferably about 70-100 amino acid residues, or more preferably at least about 80 amino acid residues and has a bit score for the alignment of the sequence to the helicase conserved C-terminal domain (HMM) of at least about 60, 70, 80, 90, 95, or greater. An alignment of the helicase conserved C-terminal domain (amino acids 629 to 712 of SEQ ID NO:23) of human 58224 with a consensus amino acid sequence derived from a hidden Markov model is depicted in FIG. 14B. Typically, a helicase conserved C-terminal domain can catalyze the separation of two complementary strands of a duplex nucleic acid and/or the hydrolysis of an NTP (e.g., ATP). For example, a helicase domain can show DNA-dependent ATPase activity.

[1350] In a preferred embodiment, a 58224 polypeptide or protein has a “helicase conserved C-terminal domain” or a region which includes at least about 50-200 amino acids, preferably about 50-100 amino acid residues, or more preferably at least about 80 amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “helicase conserved C-terminal domain”, e.g., the helicase conserved C-terminal domain of human 58224 (e.g., residues 629-712 of SEQ ID NO:23).

[1351] To identify the presence of an “SNF2 N-terminal domain” or a “helicase conserved C-terminal domain” in a 58224 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against a database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference. A search was performed against the HMM database resulting in the identification of an “SNF2 N-terminal domain” in the amino acid sequence of human 58224 at about residues 226-577 of SEQ ID NO:23 (see FIG. 14A) and a “helicase conserved C-terminal domain” in the amino acid sequence of human 58224 at about residues 629-712 of SEQ ID NO:23 (see FIG. 14B).
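
The domain definitions above combine a length range with a minimum bit score. The following Python sketch shows one way such criteria could be applied to a list of HMM search hits. It is an illustration only: the domain labels `SNF2_N` and `helicase_C`, the `DomainHit` record, and the bit scores in the example are hypothetical placeholders, not values reported in this document, and the "about" ranges are treated as hard limits for simplicity.

```python
from dataclasses import dataclass

# Length ranges and minimum bit scores taken from the domain definitions above.
# The dictionary keys are hypothetical labels chosen for this sketch.
DOMAIN_CRITERIA = {
    "SNF2_N": {"min_len": 200, "max_len": 500, "min_bits": 150.0},
    "helicase_C": {"min_len": 50, "max_len": 200, "min_bits": 50.0},
}


@dataclass
class DomainHit:
    domain: str       # key into DOMAIN_CRITERIA
    start: int        # first residue of the hit (1-based, inclusive)
    end: int          # last residue of the hit (1-based, inclusive)
    bit_score: float  # alignment bit score against the domain HMM

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def meets_definition(hit: DomainHit) -> bool:
    """Return True if the hit satisfies both the length and bit-score criteria."""
    c = DOMAIN_CRITERIA[hit.domain]
    return c["min_len"] <= hit.length <= c["max_len"] and hit.bit_score >= c["min_bits"]


# Residue coordinates are those given for human 58224; the bit scores are
# PLACEHOLDERS invented for this example, not measured values.
hits = [
    DomainHit("SNF2_N", 226, 577, 200.0),
    DomainHit("helicase_C", 629, 712, 40.0),  # deliberately below the threshold
]

for hit in hits:
    print(hit.domain, hit.length, meets_definition(hit))
```

With these placeholder scores, the first hit (352 residues) passes and the second (84 residues) fails only because its assumed score is below 50.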

[1352] A 58224 family member can include: a SNF2 N-terminal domain, a helicase conserved C-terminal domain, a lymphoid specific lymphocyte specific helicase domain 1 or a lymphoid specific lymphocyte specific helicase domain 2.

[1353] As the 58224 polypeptides of the invention may modulate 58224-mediated activities, they may be useful for developing novel diagnostic and therapeutic agents for 58224-mediated or related disorders, as described below.

[1354] As used herein, a “58224 activity”, “biological activity of 58224” or “functional activity of 58224”, refers to an activity exerted by a 58224 protein, polypeptide or nucleic acid molecule on, e.g., a 58224-responsive cell or on a 58224 substrate, e.g., a protein substrate, as determined in vivo or in vitro. In one embodiment, a 58224 activity is a direct activity, such as an association with a 58224 target molecule. A “target molecule” or “binding partner” is a molecule with which a 58224 protein binds or interacts in nature. In an exemplary embodiment, the target molecule is a 58224 substrate. A 58224 activity can also be an indirect activity, e.g., a cellular signaling activity mediated by interaction of the 58224 protein with a 58224 substrate. For example, the 58224 proteins of the present invention can have one or more of the following activities: (1) catalyze the separation of two complementary strands of a duplex nucleic acid; (2) bind DNA; (3) bind nucleoside 5′-triphosphate (NTP); (4) modulate replication; (5) modulate recombination; (6) modulate DNA repair; (7) modulate transcription; (8) modulate translation; (9) modulate RNA splicing; (10) modulate nucleic acid metabolism; (11) act as a transcriptional regulator; (12) disrupt mononucleosomal structure; (13) modulate the alteration of chromatin structure; (14) modulate rearrangement of receptor genes encoding lymphocyte antigen receptors; (15) modulate T- and B-cell developmental events; or (16) act as an agonist or antagonist of any of (1)-(15).

[1355] Based on the above-described sequence similarities, the 58224 molecules of the present invention are predicted to have biological activities similar to those of helicase family members, e.g., modulating nearly all metabolic transactions of nucleic acids, e.g., replication, repair, recombination, and transcription of DNA, and splicing of RNA. Thus, the 58224 molecules can act as novel diagnostic targets and therapeutic agents for controlling disorders associated with aberrant helicase activity, e.g., aberrant proliferation or differentiation, e.g., Werner syndrome, Bloom syndrome, Cockayne's syndrome, xeroderma pigmentosum, lymphoid proliferative diseases, cancer and α-thalassemia X-linked mental retardation.

[1356] 58224 mRNA expression is upregulated in some tumor tissues, e.g., in human breast, ovary, lung, and colon tumors, in liver metastases, and in fetal liver (see Example 16). Accordingly, 58224 molecules may act as novel therapeutic and prophylactic agents for controlling disorders or diseases involving aberrant activities of the cells in which these molecules are expressed, and as diagnostic markers useful for indicating the presence of or predisposition towards developing such disorders, or monitoring the progression or regression of a disorder. Examples of such disorders include cellular proliferative and/or differentiative disorders, e.g., breast, ovary, lung or colon cancers.

[1357] Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of breast, ovary, colon, lung, and liver origin.

[1358] As used herein, the terms “cancer”, “hyperproliferative” and “neoplastic” refer to cells having the capacity for autonomous growth, i.e., an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative and neoplastic disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. “Pathologic hyperproliferative” cells occur in disease states characterized by malignant tumor growth. Examples of non-pathologic hyperproliferative cells include proliferation of cells associated with wound repair.

[1359] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[1360] The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

[1361] The term “sarcoma” is art recognized and refers to malignant tumors of mesenchymal derivation.

[1362] Examples of cellular proliferative and/or differentiative disorders of the breast include, but are not limited to, proliferative breast disease including, e.g., epithelial hyperplasia, sclerosing adenosis, and small duct papillomas; tumors, e.g., stromal tumors such as fibroadenoma, phyllodes tumor, and sarcomas, and epithelial tumors such as large duct papilloma; carcinoma of the breast including in situ (noninvasive) carcinoma that includes ductal carcinoma in situ (including Paget's disease) and lobular carcinoma in situ, and invasive (infiltrating) carcinoma including, but not limited to, invasive ductal carcinoma, invasive lobular carcinoma, medullary carcinoma, colloid (mucinous) carcinoma, tubular carcinoma, and invasive papillary carcinoma, and miscellaneous malignant neoplasms. Disorders in the male breast include, but are not limited to, gynecomastia and carcinoma.

[1363] Examples of cellular proliferative and/or differentiative disorders involving the ovary include, for example, polycystic ovarian disease, Stein-Leventhal syndrome, pseudomyxoma peritonei and stromal hyperthecosis; ovarian tumors such as tumors of coelomic epithelium, serous tumors, mucinous tumors, endometrioid tumors, clear cell adenocarcinoma, cystadenofibroma, Brenner tumor, surface epithelial tumors; germ cell tumors such as mature (benign) teratomas, monodermal teratomas, immature malignant teratomas, dysgerminoma, endodermal sinus tumor, choriocarcinoma; sex cord-stromal tumors such as granulosa-theca cell tumors, thecoma-fibromas, androblastomas, hilus cell tumors, and gonadoblastoma; and metastatic tumors such as Krukenberg tumors.

[1364] Examples of cellular proliferative and/or differentiative disorders of the lung include, but are not limited to, bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

[1365] Examples of cellular proliferative and/or differentiative disorders of the colon include, but are not limited to, non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors.

[1366] Examples of cellular proliferative and/or differentiative disorders of the liver include, but are not limited to, nodular hyperplasias, adenomas, and malignant tumors, including primary carcinoma of the liver and metastatic tumors.

[1367] The 58224 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO:23 thereof are collectively referred to as “polypeptides or proteins of the invention” or “58224 polypeptides or proteins”. Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “58224 nucleic acids.” 58224 molecules refer to 58224 nucleic acids, polypeptides, and antibodies.

[1368] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA) and RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA generated, e.g., by the use of nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[1369] The term “isolated or purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules that are present in the natural source of the nucleic acid. For example, with respect to genomic DNA, the term “isolated” includes nucleic acid molecules that are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences that naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences that naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[1370] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and nonaqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified.
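
The four sets of hybridization and wash conditions listed above can be summarized as a small lookup table. The Python sketch below is one possible encoding, offered for illustration; the `Stringency` record, the `CONDITIONS` table and the `conditions_for` helper are hypothetical names. Temperatures described as "about" or "at least" in the text are stored as single values, and the optional increase of the low stringency wash to 55° C. is not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stringency:
    hybridization: str  # hybridization buffer
    hyb_temp_c: int     # hybridization temperature in degrees C
    wash: str           # wash buffer
    wash_temp_c: int    # wash temperature in degrees C


# Values transcribed from the four conditions listed in the text.
CONDITIONS = {
    "low": Stringency("6X SSC", 45, "0.2X SSC, 0.1% SDS", 50),
    "medium": Stringency("6X SSC", 45, "0.2X SSC, 0.1% SDS", 60),
    "high": Stringency("6X SSC", 45, "0.2X SSC, 0.1% SDS", 65),
    "very_high": Stringency("0.5 M sodium phosphate, 7% SDS", 65, "0.2X SSC, 1% SDS", 65),
}


def conditions_for(level: Optional[str] = None) -> Stringency:
    """Look up conditions; very high stringency applies when none is specified."""
    return CONDITIONS[level or "very_high"]


print(conditions_for())       # default: very high stringency
print(conditions_for("low"))
```

Defaulting to `very_high` mirrors the statement that very high stringency conditions are to be used unless otherwise specified.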

[1371] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes a natural protein).

[1372] As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules that include an open reading frame encoding a 58224 protein, preferably a mammalian 58224 protein, and further can include non-coding regulatory sequences and introns.

[1373] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. In one embodiment, the language “substantially free” means preparation of 58224 protein having less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-58224 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-58224 chemicals. When the 58224 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[1374] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 58224 (e.g., the sequence of SEQ ID NO:22, 24, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______) without abolishing or more preferably, without substantially altering a biological activity of the 58224 protein, whereas an “essential” amino acid residue results in such a change. For example, amino acid residues that are conserved among the polypeptides of the present invention, e.g., those present in the helicase domain or the SNF2 domain, are predicted to be particularly unamenable to alteration.

[1375] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a 58224 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a 58224 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 58224 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO:22, 24, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
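
The side chain families listed above lend themselves to a simple membership test. The following Python sketch, using one-letter amino acid codes, treats a substitution as conservative when the two residues share at least one of the listed families. The function name `is_conservative` and this "share any family" rule are assumptions made for illustration; the text itself only states that replacement within the same family is preferred.

```python
# Side chain families as listed in the text, written with one-letter codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),                # lysine, arginine, histidine
    "acidic": set("DE"),                # aspartic acid, glutamic acid
    "uncharged_polar": set("GNQSTYC"),  # glycine ... cysteine
    "nonpolar": set("AVLIPFMW"),        # alanine ... tryptophan
    "beta_branched": set("TVI"),        # threonine, valine, isoleucine
    "aromatic": set("YFWH"),            # tyrosine ... histidine
}


def shared_families(original: str, replacement: str) -> list:
    """Return the names of all families containing both residues."""
    return [
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if original in members and replacement in members
    ]


def is_conservative(original: str, replacement: str) -> bool:
    """True if the residues differ and belong to at least one common family."""
    return original != replacement and bool(shared_families(original, replacement))


print(is_conservative("K", "R"))  # both basic
print(is_conservative("T", "V"))  # both beta-branched
print(is_conservative("K", "D"))  # basic vs. acidic
```

Because some residues appear in more than one family (for example, histidine is both basic and aromatic), a residue can have conservative replacements drawn from several families.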

[1376] As used herein, a “biologically active portion” of a 58224 protein includes a fragment of a 58224 protein that participates in an interaction between a 58224 molecule and a non-58224 molecule. Biologically active portions of a 58224 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 58224 protein, e.g., the amino acid sequence shown in SEQ ID NO:23, which include fewer amino acids than the full length 58224 protein and exhibit at least one activity of a 58224 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 58224 protein, e.g., helicase activity. A biologically active portion of a 58224 protein can be a polypeptide that is, for example, 10, 25, 50, 100, 200 or more amino acids in length. Biologically active portions of a 58224 protein can be used as targets for developing agents that modulate a 58224 mediated activity, e.g., helicase activity.

[1377] Particularly preferred 58224 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO:23. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:23 are termed substantially identical.

[1378] In the context of nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:22 or 24, are termed substantially identical.

[1379] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[1380] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, even more preferably at least 60%, and even more preferably at least 70%, 80%, 90%, or 100% of the length of the reference sequence (e.g., when aligning a second sequence to the 58224 amino acid sequence of SEQ ID NO:23 having 838 amino acid residues, at least 251, preferably at least 335, more preferably at least 419, even more preferably at least 503, and even more preferably at least 587, 670, 754, or 838 amino acid residues are aligned). The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”). The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
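
The arithmetic in the preceding paragraph can be illustrated with a short Python sketch. The first function computes the number of residues corresponding to a given percentage of a reference sequence (rounded to the nearest residue), and the second computes percent identity over an already aligned pair of sequences. The function names are hypothetical, and dividing identical positions by the total number of alignment columns (so that every gap position counts against identity) is only one common convention assumed here; the text does not fix a single formula.

```python
def min_aligned_residues(reference_length: int, percent: int) -> int:
    """Residues corresponding to `percent` of the reference, rounded to nearest."""
    return (reference_length * percent + 50) // 100


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over all alignment columns; gap columns count as non-identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    return 100.0 * identical / len(aligned_a)


# Residue counts for an 838-residue reference such as SEQ ID NO:23.
thresholds = {p: min_aligned_residues(838, p) for p in (30, 40, 50, 60, 70, 80, 90, 100)}
print(thresholds)

# Toy alignment (not a 58224 sequence): 5 identical columns out of 6.
print(round(percent_identity("ACDE-G", "ACDEFG"), 2))
```

For the 838-residue reference, the computed counts are 251, 335, 419, 503, 587, 670, 754 and 838 residues for 30% through 100%, respectively.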

[1381] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch (J. Mol. Biol. (48):444-453 (1970)) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used if the practitioner is uncertain about what parameters should be applied to determine if a molecule is within the invention) is a BLOSUM62 scoring matrix with a gap open penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
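
For readers unfamiliar with global alignment, the following Python sketch illustrates the dynamic programming idea behind the Needleman and Wunsch algorithm. It is a simplified teaching example and not the GAP program: it uses an assumed match/mismatch score and a single linear gap penalty in place of a BLOSUM62 or PAM250 matrix and separate gap and length weights, and the default parameter values are arbitrary.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Global alignment with a linear gap penalty.

    Returns (score, aligned_a, aligned_b), with '-' marking gap positions.
    """
    n, m = len(a), len(b)

    def sub(i: int, j: int) -> int:
        # Score for aligning residue a[i-1] with residue b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


# Toy sequences (not 58224 sequences).
print(needleman_wunsch("ACDEFG", "ACDFG"))
```

In practice, an established alignment program with the substitution matrix and gap parameters specified in the text would be used; the sketch only shows how the optimal score and the gap placement arise from the recurrence.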

[1382] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of Meyers and Miller (CABIOS, 4:11-17 (1989)) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[1383] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 58224 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 58224 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25(17):3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[1384] “Misexpression or aberrant expression”, as used herein, refers to a non-wild type pattern of gene expression, at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over or under expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of decreased expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[1385] “Subject,” as used herein, refers to human and non-human animals. The term “non-human animals” of the invention includes all vertebrates, e.g., mammals, such as non-human primates (particularly higher primates), sheep, dog, rodent (e.g., mouse or rat), guinea pig, goat, pig, cat, rabbits, cow, and non-mammals, such as chickens, amphibians, reptiles, etc. In a preferred embodiment, the subject is a human. In another embodiment, the subject is an experimental animal or animal suitable as a disease model.

[1386] A “purified preparation of cells”, as used herein, refers to, in the case of plant or animal cells, an in vitro preparation of cells and not an entire intact plant or animal. In the case of cultured cells or microbial cells, it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[1387] Various aspects of the invention are described in further detail below.

[1388] Isolated Nucleic Acid Molecules of 58224

[1389] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes a 58224 polypeptide described herein, e.g., a full-length 58224 protein or a fragment thereof, e.g., a biologically active portion of a 58224 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 58224 mRNA, or fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[1390] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO:22, 24, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the 58224 protein (i.e., “the coding region”) as well as 5′ untranslated sequences. Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:22 (e.g., the sequences corresponding to SEQ ID NO:24) and, e.g., no flanking sequences that normally accompany the subject sequence.

[1391] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule that is a complement of the nucleotide sequence shown in SEQ ID NO:22, 24, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO:22, 24, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ such that it can hybridize to the nucleotide sequence shown in SEQ ID NO:22, 24, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, thereby forming a stable duplex.

[1392] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence that is at least about: 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO:22, 24, or the entire length of the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______. In the case of an isolated nucleic acid molecule which is longer than or equivalent in length to the reference sequence, e.g., SEQ ID NO:22 or 24, the comparison is made with the full length of the reference sequence. Where the isolated nucleic acid molecule is shorter than the reference sequence, e.g., shorter than SEQ ID NO:22 or 24, the comparison is made to a segment of the reference sequence of the same length (excluding any loop required by the homology calculation).

[1393] 58224 Nucleic Acid Fragments

[1394] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO:22, 24, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______. For example, such a nucleic acid molecule can include a fragment that can be used as a probe or primer or a fragment encoding a portion of a 58224 protein, e.g., an immunogenic or biologically active portion of a 58224 protein. A fragment can comprise nucleotides encoding amino acids 226 to 577 of SEQ ID NO:23, which encode a SNF2 domain, or amino acids 629 to 712 of SEQ ID NO:23, which encode a C-terminal helicase domain of human 58224. The nucleotide sequence determined from the cloning of the 58224 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 58224 family members, or fragments thereof, as well as 58224 homologues or fragments thereof, from other species.

[1395] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment that includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 212 or 273 amino acids in length. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention.

[1396] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment also can include one or more domains, regions, or functional sites described herein.

[1397] In a preferred embodiment, the fragment is at least 50, 100, 143, 150, 200, 250, 294, 300, 350, 398, 400, 450, 500, 550, 600, 650, 700, 750, 800, 820, 850, 900, 950, 1000, 1500, 2000, 2253 nucleotides in length, and hybridizes under a stringent hybridization condition as described herein to a nucleic acid molecule of SEQ ID NO:22, 24, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______.

[1398] 58224 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under a stringent hybridization condition as described herein to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO:22, 24, the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or a naturally occurring allelic variant or mutant of SEQ ID NO:22, 24, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______.

[1399] In a preferred embodiment the nucleic acid is a probe that is at least 5 or 10 and less than 500, 300, or 200 base pairs in length, and more preferably is less than 100 or less than 50 base pairs in length. It should be identical, or differ by 1, or less than 5 or 10 bases, from a sequence disclosed herein. If alignment is needed for this comparison, the sequences should be aligned for maximum homology. “Looped” out sequences in the alignment from deletions, insertions, or mismatches, are considered differences.

[1400] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid that encodes a helicase domain: amino acids 226 to 577 of SEQ ID NO:23 or 629 to 712 of SEQ ID NO:23.

[1401] In another embodiment a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 58224 sequence, e.g., a region, domain, or site described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100 or 200 base pairs in length. The primers should be identical to, or differ by one base from, a sequence disclosed herein or a naturally occurring variant. For example, the primers can be suitable for amplifying all or a portion of a helicase domain, e.g., amino acids 226 to 577 of SEQ ID NO:23 or 629 to 712 of SEQ ID NO:23.

[1402] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[1403] A nucleic acid fragment encoding a “biologically active portion of a 58224 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:22, 24, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, which encodes a polypeptide having a 58224 biological activity (e.g., the biological activities of the 58224 proteins described herein), expressing the encoded portion of the 58224 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 58224 protein. For example, a nucleic acid fragment encoding a biologically active portion of 58224 includes a helicase domain, e.g., amino acid residues 226 to 577 of SEQ ID NO:23 or residues 629 to 712 of SEQ ID NO:23. A nucleic acid fragment encoding a biologically active portion of a 58224 polypeptide may comprise a nucleotide sequence that is greater than about 80, 100, 200, 300 or more nucleotides in length (e.g., greater than about 350 nucleotides in length).

[1404] In preferred embodiments, a nucleic acid includes a nucleotide sequence which is about 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300 or more nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO:22 or 24.

[1405] In preferred embodiments, a nucleic acid includes a nucleotide sequence which is at least about 300, 350, 398, 400, 450, 500, 550, 600, 650, 700, 750, 800, 820, 850, 900, 950, 1000, 1500, 2000, 2253, or more nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO:22 or 24.

[1406] In a preferred embodiment, a nucleic acid fragment has a nucleotide sequence other than (e.g., differs by one or more nucleotides from) Genbank accession number AK001201.

[1407] In a preferred embodiment, a nucleic acid fragment includes at least one, preferably more, nucleotides from the sequence of nucleotide 1 to 1794 or nucleotide 2614 to 2798 of SEQ ID NO:22.

[1408] In a preferred embodiment, a nucleic acid fragment includes at least one, preferably more, nucleotides from the sequence of nucleotide 1 to 2258 or nucleotide 2253 to 2798 of SEQ ID NO:22.

[1409] In a preferred embodiment, a nucleic acid fragment includes at least one, preferably more, nucleotides from the sequence of nucleotide 1 to 25 or nucleotide 424 to 2798 of SEQ ID NO:22.

[1410] In a preferred embodiment, a nucleic acid fragment includes at least one, preferably more, nucleotides from the sequence of nucleotide 1 to 387 or nucleotide 531 to 2798 of SEQ ID NO:22.

[1411] 58224 Nucleic Acid Variants

[1412] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:22, 24, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid that encodes the same 58224 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence that differs by at least 1, but less than 5, 10, 20, 50, or 100 amino acid residues from that shown in SEQ ID NO:23. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions, insertions, or mismatches, are considered differences.

[1413] Nucleic acids of the invention can be chosen for having codons, which are preferred, or non-preferred, for a particular expression system (e.g., the nucleic acid can be one in which at least one codon, and preferably at least 10%, or 20% of the codons has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or Chinese hamster ovary (CHO) cells).
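The codon alteration described above can be sketched in Python as a synonymous recoding step that also reports the fraction of codons changed. The standard genetic code table is built from its conventional compact form. The preference map `EXAMPLE_PREFERRED` is a hypothetical placeholder for illustration only and does not represent measured codon usage for any host.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical host preferences (illustrative, not real usage data).
EXAMPLE_PREFERRED = {"L": "CTG", "R": "CGT", "K": "AAA"}


def translate(cds: str) -> str:
    """Translate a codon-aligned DNA coding sequence with the standard code."""
    cds = cds.upper()
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds: str, preferred: dict[str, str]) -> tuple[str, float]:
    """Replace codons with the preferred synonymous codon, where one is given.

    Returns the recoded sequence and the fraction of codons that were altered.
    """
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    for aa, codon in preferred.items():
        if CODON_TABLE[codon] != aa:
            raise ValueError(f"{codon} does not encode {aa}")

    recoded, changed = [], 0
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        new_codon = preferred.get(CODON_TABLE[codon], codon)
        changed += new_codon != codon
        recoded.append(new_codon)
    fraction = changed / len(recoded) if recoded else 0.0
    return "".join(recoded), fraction
```

Because every substitution is synonymous, the encoded protein is unchanged; the returned fraction can be compared against thresholds such as 10% or 20% of the codons.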

[1414] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism) or can be non-naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions, and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared with the encoded product).

[1415] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO:22 or 24, or the sequence in ATCC Accession Number ______, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 1%, 5%, 10% or 20% of the nucleotides in the subject nucleic acid. If necessary for this analysis, the sequences should be aligned for maximum homology. “Looped” out sequences from deletions, insertions, or mismatches, are considered differences.
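The difference counts described above can be sketched in Python for a pair of sequences that have already been aligned. The sketch assumes that looped-out positions are written as the gap character `-` and that the alignment was produced elsewhere. The function names are illustrative assumptions.

```python
def count_differences(aligned_subject: str, aligned_reference: str) -> int:
    """Count differing columns in a pairwise alignment.

    Mismatches and gapped ("looped out") columns both count as differences.
    """
    if len(aligned_subject) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    return sum(
        a != b
        for a, b in zip(aligned_subject.upper(), aligned_reference.upper())
    )


def difference_summary(aligned_subject: str, aligned_reference: str,
                       gap: str = "-") -> tuple[int, float]:
    """Return (number of differences, percent of subject nucleotides)."""
    n_diff = count_differences(aligned_subject, aligned_reference)
    subject_length = len(aligned_subject.replace(gap, ""))
    if subject_length == 0:
        raise ValueError("subject sequence is empty")
    return n_diff, 100.0 * n_diff / subject_length


def at_least_one_but_less_than(n_diff: float, limit: float) -> bool:
    """True when the value is at least one and strictly below the limit."""
    return 1 <= n_diff < limit
```

The same counting applies to amino acid alignments; only the alphabet changes.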

[1416] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the amino acid sequence shown in SEQ ID NO:23 or SEQ ID NO:26 or a fragment of this sequence. Such nucleic acid molecules can be obtained as being able to hybridize under a stringent hybridization condition as described herein, to the nucleotide sequence shown in SEQ ID NO:22 or 24 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 58224 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 58224 gene. Preferred variants include those that are correlated with helicase activity, e.g., nucleic acid binding, e.g., RNA binding, activity, or nucleic acid dependent NTPase activity, e.g., RNA dependent ATPase activity.

[1417] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the amino acid sequence shown in SEQ ID NO:23 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under stringent conditions, to the nucleotide sequence shown in SEQ ID NO:22 or 24 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 58224 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 58224 gene. Preferred variants include those that are correlated with helicase activity, e.g., nucleic acid binding, e.g., RNA binding, activity, or nucleic acid dependent NTPase activity, e.g., RNA dependent ATPase activity.

[1418] Allelic variants of 58224, e.g., human 58224, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 58224 protein within a population that maintain the ability to perform a helicase activity. Functional allelic variants typically will contain only conservative substitution of one or more amino acids of SEQ ID NO:23, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally-occurring amino acid sequence variants of the 58224, e.g., human 58224, protein within a population that do not have helicase activity. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO:23, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[1419] Moreover, nucleic acid molecules that encode other 58224 family members, and thus have a nucleotide sequence that differs from the 58224 sequences of SEQ ID NO:22, 24, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, are intended to be within the scope of the invention.

[1420] Antisense Nucleic Acid Molecules, Ribozymes and Modified 58224 Nucleic Acid Molecules

[1421] In another aspect, the invention features an isolated nucleic acid molecule that is antisense to 58224. An “antisense” nucleic acid can include a nucleotide sequence that is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 58224 coding strand, or to only a portion thereof (e.g., the coding region of 58224 corresponding to SEQ ID NO:24). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 58224 (e.g., the 5′ and 3′ untranslated regions).

[1422] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 58224 mRNA, but more preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding region of 58224 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 58224 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
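A minimal Python sketch of deriving an antisense oligonucleotide for the region around the translation start site follows. It assumes the target is given as a cDNA (sense strand, DNA alphabet), that the zero-based index of the A of the start codon is known, and that the window is read as 10 nucleotides upstream plus 10 nucleotides beginning at the start codon. These conventions and the function names are assumptions of the sketch.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_around_start(cdna: str, atg_index: int,
                           upstream: int = 10, downstream: int = 10) -> str:
    """Antisense oligonucleotide spanning the translation start site.

    cdna      : sense-strand sequence, 5' to 3'
    atg_index : zero-based index of the A of the start codon
    upstream / downstream : window sizes on either side (assumed defaults)
    """
    if cdna[atg_index:atg_index + 3].upper() != "ATG":
        raise ValueError("atg_index does not point at an ATG start codon")
    lo = max(0, atg_index - upstream)
    hi = min(len(cdna), atg_index + downstream)
    # The antisense strand is the reverse complement of the sense window.
    return reverse_complement(cdna[lo:hi])
```

The result is written 5′ to 3′ and pairs with the sense window; widening `upstream` and `downstream` gives longer oligonucleotides within the length range recited above.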

[1423] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions with procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[1424] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a 58224 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong polymerase II or polymerase III promoter are preferred.

[1425] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[1426] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for a 58224-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of a 58224 cDNA disclosed herein (i.e., SEQ ID NO:22, or 24), and a sequence having a known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haseloff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a 58224-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 58224 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel and Szostak (1993) Science 261:1411-1418.

[1427] 58224 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 58224 gene (e.g., the 58224 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 58224 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6(6):569-84; Helene, C. et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) BioEssays 14(12):807-15. The potential sequences that can be targeted for triple helix formation can be increased by creating a “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.

[1428] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[1429] A 58224 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4 (1): 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra; Perry-O'Keefe et al. Proc. Natl. Acad. Sci. 93: 14670-675.

[1430] PNAs of 58224 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 58224 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene, (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes, (e.g., S1 nucleases (Hyrup B. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[1431] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule, (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[1432] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region that is complementary to a 58224 nucleic acid of the invention. The molecular beacon primer and probe molecules also have two complementary regions, one having a fluorophore and one having a quencher, such that the molecular beacon is useful for quantitating the presence of a 58224 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.

[1433] Isolated 58224 Polypeptides

[1434] In another aspect, the invention features an isolated 58224 protein or fragment thereof, e.g., a biologically active portion for use as immunogens or antigens to raise or test (or more generally to bind) anti-58224 antibodies. 58224 protein can be isolated from cells or tissue sources using standard protein purification techniques. 58224 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[1435] Polypeptides of the invention include those that arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when expressed in a native cell.

[1436] In a preferred embodiment, a 58224 polypeptide has one or more of the following characteristics:

[1437] (i) it has the ability to promote the separation of two complementary strands of a duplex nucleic acid;

[1438] (ii) it has a molecular weight (e.g., a deduced molecular weight, preferably ignoring any contribution of post translational modifications), amino acid composition or other physical characteristic of a 58224 polypeptide, e.g., a polypeptide of SEQ ID NO:23;

[1439] (iii) it has an overall sequence similarity of at least 60%, more preferably at least 70, 80, 90, or 95%, with a polypeptide of SEQ ID NO:23;

[1440] (iv) it has a SNF2 N-terminal domain which has sequence similarity of preferably about 70%, 80%, 90% or 95% with amino acid residues 226-577 of SEQ ID NO:23;

[1441] (v) it has a helicase conserved C-terminal domain, which has sequence similarity of preferably about 70%, 80%, 90% or 95% with amino acid residues 629-712 of SEQ ID NO:23;

[1442] (vi) it has a lymphoid specific lymphocyte specific helicase domain 1, which has sequence similarity of preferably about 70%, 80%, 90% or 95% with amino acid residues 492-590 of SEQ ID NO:23;

[1443] (vii) it has a lymphoid specific lymphocyte specific helicase domain 2, which has sequence similarity of preferably about 70%, 80%, 90% or 95% with amino acid residues 775-838 of SEQ ID NO:23;

[1444] (viii) it has at least 2, preferably 4, and most preferably 9 of the cysteines found in the amino acid sequence of the native protein;

[1445] (ix) it has the ability to bind a nucleic acid, e.g., RNA or DNA; or

[1446] (x) it has an NTPase activity, e.g., a nucleic acid dependent ATPase activity.

[1447] In a preferred embodiment, the 58224 protein or fragment thereof differs from the corresponding sequence in SEQ ID NO:23. In one embodiment, it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another embodiment, it differs from the corresponding sequence in SEQ ID NO:23 by at least one residue but less than 20%, 15%, 10% or 5% of the residues in it differ from the corresponding sequence in SEQ ID NO:23. (If this comparison requires alignment, the sequences should be aligned for maximum homology. “Looped” out sequences from deletions, insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non-essential residue or a conservative substitution. In a preferred embodiment, the differences are not in a helicase domain. In another preferred embodiment one or more differences are at non-active site residues, e.g., amino acids 1-226, 578 to 628, or 713 to 838 of SEQ ID NO:23.

[1448] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue that is not essential for activity. Such 58224 proteins differ in amino acid sequence from SEQ ID NO:23, yet retain biological activity.

[1449] In one embodiment, the protein includes an amino acid sequence at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or more homologous to SEQ ID NO:23.

[1450] In another embodiment, the protein includes an amino acid sequence at least 106 amino acids in length, and about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 98% homologous to SEQ ID NO:23.

[1451] In a preferred embodiment, the protein includes an amino acid sequence at least 273 amino acids in length, and about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 98% homologous to SEQ ID NO:23.

[1452] In another embodiment, a 58224 protein or fragment has an amino acid sequence which differs from the amino acid sequence encoded by the nucleotide sequence of Genbank Accession Number AK001201 or from the amino acid sequence of Genbank Accession Number BAA91550 by at least one, two, three, five or more amino acids. The variations may include the addition, replacement, and/or deletion of amino acid residues.

[1453] In another embodiment, a 58224 protein fragment has an amino acid sequence which contains one, preferably more, residues from the sequence of residues 1-563 or 837-838 of SEQ ID NO:23.

[1454] In another embodiment, a 58224 protein fragment has an amino acid sequence which contains one, preferably more, residues from the sequence of residues 106-838 of SEQ ID NO:23.

[1455] A 58224 protein or fragment is provided which varies from the sequence of SEQ ID NO:23 in non-active site residues by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment, but which does not differ from SEQ ID NO:23 or SEQ ID NO:26 in regions having a helicase activity. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions, insertions, or mismatches, are considered differences.) In some embodiments, the difference is at a non-essential residue or is a conservative substitution, while in others, the difference is at an essential residue or is a non-conservative substitution.

[1456] In one embodiment, a biologically active portion of a 58224 protein includes a helicase domain, e.g., a SNF2 domain, or a conserved C-terminal helicase domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 58224 protein.

[1457] In a preferred embodiment, the 58224 protein has an amino acid sequence shown in SEQ ID NO:23. In other embodiments, the 58224 protein is substantially identical to SEQ ID NO:23. In yet another embodiment, the 58224 protein is substantially identical to SEQ ID NO:23 and retains the functional activity of the protein of SEQ ID NO:23, as described in detail in subsection I above. Accordingly, in another embodiment, the 58224 protein is a protein which includes an amino acid sequence at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO:23.

[1458] 58224 Chimeric or Fusion Proteins

[1459] In another aspect, the invention provides 58224 chimeric or fusion proteins. As used herein, a 58224 “chimeric protein” or “fusion protein” includes a 58224 polypeptide linked to a non-58224 polypeptide. A “non-58224 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein that is not substantially homologous to the 58224 protein, e.g., a protein that is different from the 58224 protein and that is derived from the same or a different organism. The 58224 polypeptide of the fusion protein can correspond to all or a portion, e.g., a fragment described herein, of a 58224 amino acid sequence. In a preferred embodiment, a 58224 fusion protein includes at least one (e.g., two) biologically active portion of a 58224 protein. The non-58224 polypeptide can be fused to the N-terminus or C-terminus of a 58224 polypeptide.

[1460] The fusion protein can include a moiety that has high affinity for a ligand, e.g., a helicase substrate. For example, the fusion protein can be a GST-58224 fusion protein in which the 58224 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 58224. Alternatively, the fusion protein can be a 58224 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 58224 can be increased through use of a heterologous signal sequence.

[1461] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[1462] The 58224 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 58224 fusion proteins can be used to affect the bioavailability of a 58224 substrate. 58224 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example: (i) aberrant modification or mutation of a gene encoding a 58224 protein; (ii) misregulation of the 58224 gene; and (iii) aberrant post-translational modification of a 58224 protein.

[1463] Moreover, 58224-fusion proteins of the invention can be used as immunogens to produce anti-58224 antibodies in a subject, to purify 58224 ligands, and in screening assays to identify molecules that inhibit the interaction of 58224 with a 58224 substrate.

[1464] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 58224-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 58224 protein.

[1465] Variants of 58224 Proteins

[1466] In another aspect, the invention features a variant of a 58224 polypeptide, e.g., a polypeptide that functions as an agonist (mimetic) or as an antagonist of 58224 activities. Variants of the 58224 proteins can be generated by mutagenesis, e.g., discrete point mutations, the insertion or deletion of sequences or the truncation of a 58224 protein. An agonist of the 58224 protein retains substantially the same, or a subset, of the biological activities of the naturally occurring form of a 58224 protein. An antagonist of a 58224 protein can inhibit one or more of the activities of the naturally occurring form of the 58224 protein by, for example, competitively modulating a 58224-mediated activity of a 58224 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 58224 protein.

[1467] Variants of a 58224 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a 58224 protein for agonist or antagonist activity.

[1468] Libraries of fragments, e.g., N terminal, C terminal, or internal fragments, of a 58224 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of a 58224 protein.
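As an illustration of what such a fragment library contains, the following Python sketch enumerates codon-aligned fragments of a coding sequence and labels each as N-terminal, C-terminal, or internal. It assumes fragments are kept in frame and that the full-length sequence is excluded. The labels, the `min_codons` parameter, and the function name are assumptions of the sketch, not a description of any particular library construction method.

```python
from typing import Iterator


def truncation_fragments(cds: str, min_codons: int = 1) -> Iterator[tuple[str, str]]:
    """Yield (kind, fragment) pairs for in-frame fragments of a coding sequence.

    kind is "N-terminal" if the fragment keeps the first codon,
    "C-terminal" if it keeps the last codon, and "internal" otherwise.
    The full-length sequence itself is not yielded.
    """
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    n_codons = len(cds) // 3
    for first in range(n_codons):
        for last in range(first + min_codons, n_codons + 1):
            if first == 0 and last == n_codons:
                continue  # skip the full-length sequence
            if first == 0:
                kind = "N-terminal"
            elif last == n_codons:
                kind = "C-terminal"
            else:
                kind = "internal"
            yield kind, cds[3 * first:3 * last]
```

For a four-codon sequence this yields three fragments of each kind; raising `min_codons` restricts the population to longer fragments.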

[1469] Variants in which a cysteine residue is added or deleted or in which a residue that is glycosylated is added or deleted are particularly preferred.

[1470] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property, are known. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with screening assays to identify 58224 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6(3):327-331).

[1471] Cell based assays can be exploited to analyze a variegated 58224 library. For example, a library of expression vectors can be transfected into a cell line, e.g., a cell line which ordinarily responds to 58224 in a substrate-dependent manner. The transfected cells are then contacted with 58224 and the effect of the expression of the mutant on signaling by a 58224 substrate can be detected, e.g., by measuring helicase activity, e.g., a helicase activity described herein. Plasmid DNA can then be recovered from the cells that score for inhibition, or alternatively, potentiation of signaling by the 58224 substrate, and the individual clones further characterized.

[1472] In another aspect, the invention features a method of making a 58224 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 58224 polypeptide. The method includes: altering the sequence of a 58224 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain, or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[1473] In another aspect, the invention features a method of making a fragment or analog of a 58224 polypeptide that retains at least one biological activity of a naturally occurring 58224 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of a 58224 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[1474] Anti-58224 Antibodies

[1475] In another aspect, the invention provides an anti-58224 antibody, or a fragment thereof (e.g., an antigen-binding fragment thereof). The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. As used herein, the term “antibody” refers to a protein comprising at least one, and preferably two, heavy (H) chain variable regions (abbreviated herein as VH), and at least one and preferably two light (L) chain variable regions (abbreviated herein as VL). The VH and VL regions can be further subdivided into regions of hypervariability, termed “complementarity determining regions” (“CDR”), interspersed with regions that are more conserved, termed “framework regions” (FR). The extent of the framework region and CDR's has been precisely defined (see, Kabat et al. (1991) Sequences of Proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, and Chothia et al. (1987) J. Mol. Biol. 196:901-917, which are incorporated herein by reference). Each VH and VL is composed of three CDR's and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.

[1476] The anti-58224 antibody can further include a heavy and light chain constant region, to thereby form a heavy and light immunoglobulin chain, respectively. In one embodiment, the antibody is a tetramer of two heavy immunoglobulin chains and two light immunoglobulin chains, wherein the heavy and light immunoglobulin chains are inter-connected by, e.g., disulfide bonds. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. The light chain constant region is comprised of one domain, CL. The variable region of the heavy and light chains contains a binding domain that interacts with an antigen. The constant regions of the antibodies typically mediate the binding of the antibody to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system.

[1477] As used herein, the term “immunoglobulin” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes. The recognized human immunoglobulin genes include the kappa, lambda, alpha (IgA1 and IgA2), gamma (IgG1, IgG2, IgG3, IgG4), delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Full-length immunoglobulin “light chains” (about 25 Kd or 214 amino acids) are encoded by a variable region gene at the NH2-terminus (about 110 amino acids) and a kappa or lambda constant region gene at the COOH-terminus. Full-length immunoglobulin “heavy chains” (about 50 Kd or 446 amino acids), are similarly encoded by a variable region gene (about 116 amino acids) and one of the other aforementioned constant region genes, e.g., gamma (encoding about 330 amino acids).

[1478] The term “antigen-binding fragment” of an antibody (or simply “antibody portion,” or “fragment”), as used herein, refers to one or more fragments of a full-length antibody that retain the ability to specifically bind to the antigen, e.g., 58224 polypeptide or fragment thereof. Examples of antigen-binding fragments of the anti-58224 antibody include, but are not limited to: (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab′)₂ fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 341:544-546), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) Science 242:423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879-5883). Such single chain antibodies are also encompassed within the term “antigen-binding fragment” of an antibody. These antibody fragments are obtained using conventional techniques known to those with skill in the art, and the fragments are screened for utility in the same manner as are intact antibodies.

[1479] The anti-58224 antibody can be a polyclonal or a monoclonal antibody. In other embodiments, the antibody can be recombinantly produced, e.g., produced by phage display or by combinatorial methods.

[1480] Phage display and combinatorial methods for generating anti-58224 antibodies are known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. International Publication No. WO 92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. International Publication No. WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J. 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and Barbas et al. (1991) PNAS 88:7978-7982, the contents of all of which are incorporated by reference herein).

[1481] In one embodiment, the anti-58224 antibody is a fully human antibody (e.g., an antibody made in a mouse which has been genetically engineered to produce an antibody from a human immunoglobulin sequence), or a non-human antibody, e.g., a rodent (mouse or rat), goat, primate (e.g., monkey), or camel antibody. Preferably, the non-human antibody is a rodent (mouse or rat) antibody. Methods of producing rodent antibodies are known in the art.

[1482] Human monoclonal antibodies can be generated using transgenic mice carrying the human immunoglobulin genes rather than the mouse system. Splenocytes from these transgenic mice immunized with the antigen of interest are used to produce hybridomas that secrete human mAbs with specific affinities for epitopes from a human protein (see, e.g., Wood et al. International Application WO 91/00906, Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International Application WO 92/03918; Kay et al. International Application WO 92/03917; Lonberg, N. et al. 1994 Nature 368:856-859; Green, L. L. et al. 1994 Nature Genet. 7:13-21; Morrison, S. L. et al. 1984 Proc. Natl. Acad. Sci. USA 81:6851-6855; Bruggemann et al. 1993 Year Immunol 7:33-40; Tuaillon et al. 1993 PNAS 90:3720-3724; Bruggemann et al. 1991 Eur J Immunol 21:1323-1326).

[1483] An anti-58224 antibody can be one in which the variable region, or a portion thereof, e.g., the CDR's, are generated in a non-human organism, e.g., a rat or mouse. Chimeric, CDR-grafted, and humanized antibodies are within the invention. Antibodies generated in a non-human organism, e.g., a rat or mouse, and then modified, e.g., in the variable framework or constant region, to decrease antigenicity in a human are within the invention.

[1484] Chimeric antibodies can be produced by recombinant DNA techniques known in the art. For example, a gene encoding the Fc constant region of a murine (or other species) monoclonal antibody molecule is digested with restriction enzymes to remove the region encoding the murine Fc, and the equivalent portion of a gene encoding a human Fc constant region is substituted (see Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., International Application WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al., European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al. (1987) PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al. (1985) Nature 314:446-449; and Shaw et al., 1988, J. Natl Cancer Inst. 80:1553-1559).

[1485] A humanized or CDR-grafted antibody will have at least one or two but generally all three recipient CDR's (of heavy and/or light immunoglobulin chains) replaced with a donor CDR. A recipient CDR may be replaced with at least a portion of a non-human CDR, or only some of the CDR's may be replaced with non-human CDR's. It is only necessary to replace the number of CDR's required for binding of the humanized antibody to a 58224 or a fragment thereof. Preferably, the donor will be a rodent antibody, e.g., a rat or mouse antibody, and the recipient will be a human framework or a human consensus framework. Typically, the immunoglobulin providing the CDR's is called the “donor” and the immunoglobulin providing the framework is called the “acceptor.” In one embodiment, the donor immunoglobulin is non-human (e.g., rodent). The acceptor framework is a naturally-occurring (e.g., a human) framework or a consensus framework, or a sequence about 85% or higher, preferably 90%, 95%, 99% or higher identical thereto.
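
The identity thresholds recited for the acceptor framework can be illustrated with a short Python sketch. It assumes the two framework sequences have already been aligned to equal length without gaps and simply counts identical positions; the function name `percent_identity` and both toy sequences are hypothetical and are not taken from any immunoglobulin disclosed herein.

```python
def percent_identity(a, b):
    """Percent of positions at which two pre-aligned, equal-length sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and pre-aligned to equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


# Hypothetical toy framework segments (illustration only).
human_framework = "EVQLVESGGG"
candidate_framework = "EVQLVQSGGG"

identity = percent_identity(human_framework, candidate_framework)
meets_threshold = identity >= 85.0  # "about 85% or higher" per the text
print(identity, meets_threshold)
```

A gapped alignment would be needed for sequences of unequal length; that step is deliberately left out of this sketch.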

[1486] As used herein, the term “consensus sequence” refers to the sequence formed from the most frequently occurring amino acids (or nucleotides) in a family of related sequences (See e.g., Winnaker, From Genes to Clones (Verlagsgesellschaft, Weinheim, Germany 1987)). In a family of proteins, each position in the consensus sequence is occupied by the amino acid occurring most frequently at that position in the family. If two amino acids occur equally frequently, either can be included in the consensus sequence. A “consensus framework” refers to the framework region in the consensus immunoglobulin sequence.
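
The definition of a consensus sequence given above can be sketched in a few lines of Python. The sketch assumes the family members are already aligned to equal length; the function name `consensus` and the toy sequences are hypothetical. Because the text permits either residue when two occur equally often, the tie-break used here (the alphabetically first residue) is an arbitrary choice made only so the output is deterministic.

```python
from collections import Counter


def consensus(aligned):
    """Return the most frequent residue at each position of pre-aligned sequences."""
    if not aligned or len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    result = []
    for column in zip(*aligned):
        counts = Counter(column)
        top = max(counts.values())
        # Tie-break is arbitrary: the text allows either residue in a tie.
        result.append(min(res for res, n in counts.items() if n == top))
    return "".join(result)


# Hypothetical toy family of aligned sequences (illustration only).
family = ["EVQLVES", "QVQLVQS", "EVQLLES"]
family_consensus = consensus(family)
print(family_consensus)
```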

[1487] An antibody can be humanized by methods known in the art. Humanized antibodies can be generated by replacing sequences of the Fv variable region which are not directly involved in antigen binding with equivalent sequences from human Fv variable regions. General methods for generating humanized antibodies are provided by Morrison, S. L., 1985, Science 229:1202-1207, by Oi et al., 1986, BioTechniques 4:214, and by Queen et al. U.S. Pat. No. 5,585,089, U.S. Pat. No. 5,693,761 and U.S. Pat. No. 5,693,762, the contents of all of which are hereby incorporated by reference. Those methods include isolating, manipulating, and expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable regions from at least one of a heavy or light chain. Sources of such nucleic acid are well known to those skilled in the art and, for example, may be obtained from a hybridoma producing an antibody against a 58224 polypeptide or fragment thereof. The recombinant DNA encoding the humanized antibody, or fragment thereof, can then be cloned into an appropriate expression vector.

[1488] Humanized or CDR-grafted antibodies can be produced by CDR-grafting or CDR substitution, wherein one, two, or all CDR's of an immunoglobulin chain can be replaced. See e.g., U.S. Pat. No. 5,225,539; Jones et al. 1986 Nature 321:522-525; Verhoeyen et al. 1988 Science 239:1534; Beidler et al. 1988 J. Immunol. 141:4053-4060; Winter U.S. Pat. No. 5,225,539, the contents of all of which are hereby expressly incorporated by reference. Winter describes a CDR-grafting method which may be used to prepare the humanized antibodies of the present invention (UK Patent Application GB 2188638A, filed on Mar. 26, 1987; Winter U.S. Pat. No. 5,225,539), the contents of which are expressly incorporated by reference.

[1489] Also within the scope of the invention are humanized antibodies in which specific amino acids have been substituted, deleted or added. Preferred humanized antibodies have amino acid substitutions in the framework region, such as to improve binding to the antigen. For example, a humanized antibody will have framework residues identical to the donor framework residue or to another amino acid other than the recipient framework residue. To generate such antibodies, a selected, small number of acceptor framework residues of the humanized immunoglobulin chain can be replaced by the corresponding donor amino acids. Preferred locations of the substitutions include amino acid residues adjacent to the CDR, or which are capable of interacting with a CDR (see e.g., U.S. Pat. No. 5,585,089). Criteria for selecting amino acids from the donor are described in U.S. Pat. No. 5,585,089, e.g., columns 12-16 of U.S. Pat. No. 5,585,089, the contents of which are hereby incorporated by reference. Other techniques for humanizing antibodies are described in Padlan et al. EP 519596 A1, published on Dec. 23, 1992.

[1490] In preferred embodiments an antibody can be made by immunizing with purified 58224 antigen, or a fragment thereof, e.g., a fragment described herein.

[1491] A full-length 58224 protein, or an antigenic peptide fragment of 58224, can be used as an immunogen or can be used to identify anti-58224 antibodies made with other immunogens, e.g., cells, membrane preparations, and the like. The antigenic peptide of 58224 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO:23 or SEQ ID NO:26 and encompass an epitope of 58224. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.

[1492] Fragments of 58224 which include residues about 150-160 can be used to make antibodies against hydrophilic regions of the 58224 protein, e.g., they can be used as immunogens or used to characterize the specificity of an antibody. Similarly, fragments of 58224 which include residues 255-265 can be used to make an antibody against a hydrophobic region of the 58224 protein; a fragment of 58224 which includes residues about 226 to 577 can be used to make an antibody against the SNF2 region of the 58224 protein; and a fragment of 58224 which includes residues about 629 to 712 can be used to make an antibody against the C-terminal conserved helicase domain of the 58224 protein.

[1493] Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[1494] Antibodies which bind only native 58224 protein, only denatured or otherwise non-native 58224 protein, or which bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be identified by identifying antibodies which bind to native but not denatured 58224 protein.

[1495] Preferred epitopes encompassed by the antigenic peptide are regions of 58224 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 58224 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 58224 protein and are thus likely to constitute surface residues useful for targeting antibody production.
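
As a loose illustration of scanning a sequence for hydrophilic stretches, the following Python sketch averages Kyte-Doolittle hydropathy values over a sliding window and reports the most hydrophilic window. This is a substitute technique chosen for brevity; it is not the Emini surface probability analysis referred to above, and it does not reproduce that method's scale or formula. The function names, the window size of 7, and the toy sequence are assumptions for illustration only.

```python
# Kyte-Doolittle hydropathy values (more negative = more hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=7):
    """Mean hydropathy of every window of `window` consecutive residues."""
    scores = [KYTE_DOOLITTLE[res] for res in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def most_hydrophilic_window(seq, window=7):
    """Return (start index, peptide) of the window with the lowest mean hydropathy."""
    profile = hydropathy_profile(seq, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, seq[start:start + window]


# Hypothetical toy sequence: a charged stretch flanked by leucines.
toy = "LLLLLLL" + "KDEKRDE" + "LLLLLLL"
best = most_hydrophilic_window(toy)
print(best)
```

Hydrophilicity is only a rough proxy for surface exposure, so any region flagged this way would still need to be confirmed experimentally.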

[1496] In preferred embodiments antibodies can bind one or more of purified antigen; tissue, e.g., tissue sections; whole cells, preferably living cells; lysed cells; cell fractions.

[1497] The anti-58224 antibody can be a single chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher et al. (1999) Ann N Y Acad Sci 880:263-80; and Reiter (1996) Clin Cancer Res 2:245-52). The single chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 58224 protein.

[1498] In a preferred embodiment, the antibody has effector function and can fix complement. In other embodiments, the antibody does not recruit effector cells or fix complement.

[1499] In a preferred embodiment, the antibody has reduced or no ability to bind an Fc receptor. For example, it is an isotype or subtype, fragment or other mutant, which does not support binding to an Fc receptor, e.g., it has a mutagenized or deleted Fc receptor binding region.

[1500] The antibody can be coupled to a toxin, e.g., a polypeptide toxin, e.g., ricin or diphtheria toxin or an active fragment thereof, or to a radionuclide or an imaging agent, e.g., a radioactive, enzymatic, or other imaging agent, e.g., an NMR contrast agent. Labels which produce detectable radioactive emissions or fluorescence are preferred.

[1501] An anti-58224 antibody (e.g., monoclonal antibody) can be used to isolate 58224 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-58224 antibody can be used to detect 58224 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-58224 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labelling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

[1502] The invention also includes a nucleic acid that encodes an anti-58224 antibody, e.g., an anti-58224 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g., CHO or lymphatic cells.

[1503] The invention also includes cell lines, e.g., hybridomas, which make an anti-58224 antibody, e.g., an antibody described herein, and methods of using said cells to make a 58224 antibody.

[1504] 58224 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[1505] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[1506] A vector can include a 58224 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 58224 proteins, mutant forms of 58224 proteins, fusion proteins, and the like).

[1507] The recombinant expression vectors of the invention can be designed for expression of 58224 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990). Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[1508] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[1509] Purified fusion proteins can be used in 58224 activity assays (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 58224 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells which are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six (6) weeks).

[1510] One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990) 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
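
The codon-replacement strategy can be sketched as follows in Python: each codon is translated with the standard genetic code and, where a preferred codon is listed for that amino acid, replaced by it, so that the encoded protein is unchanged. The `PREFERRED` mapping below is a small illustrative placeholder, assumed for this sketch, and is not codon usage data taken from Wada et al.; the function names and the toy open reading frame are likewise hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Illustrative placeholder only: one assumed "preferred" codon per amino acid.
PREFERRED = {"L": "CTG", "R": "CGT", "K": "AAA", "G": "GGC"}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids ('*' = stop)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna, preferred=PREFERRED):
    """Swap each codon for the preferred synonymous codon, if one is listed."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        out.append(preferred.get(CODON_TABLE[codon], codon))
    return "".join(out)


original = "ATGTTAAGAAAGGGATAA"  # hypothetical toy ORF encoding M-L-R-K-G-stop
optimized = recode(original)
print(optimized, translate(optimized))
```

Checking that `translate` gives the same result before and after recoding is a simple safeguard that only synonymous changes were made.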

[1511] The 58224 expression vector can be a yeast expression vector, a vector for expression in insect cells, e.g., a baculovirus expression vector or a vector suitable for expression in mammalian cells.

[1512] When used in mammalian cells, the expression vector's control functions are often provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[1513] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3:537-546).

[1514] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus. For a discussion of the regulation of gene expression using antisense genes see Weintraub, H. et al., Antisense RNA as a molecular tool for genetic analysis, Reviews - Trends in Genetics, Vol. 1(1) 1986.

[1515] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., a 58224 nucleic acid molecule within a recombinant expression vector or a 58224 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[1516] A host cell can be any prokaryotic or eukaryotic cell. For example, a 58224 protein can be expressed in bacterial cells such as E. coli, insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells). Other suitable host cells are known to those skilled in the art.

[1517] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[1518] A host cell of the invention can be used to produce (i.e., express) a 58224 protein. Accordingly, the invention further provides methods for producing a 58224 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a 58224 protein has been introduced) in a suitable medium such that a 58224 protein is produced. In another embodiment, the method further includes isolating a 58224 protein from the medium or the host cell.

[1519] In another aspect, the invention features a cell or purified preparation of cells which include a 58224 transgene, or which otherwise misexpress 58224. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include a 58224 transgene, e.g., a heterologous form of a 58224, e.g., a gene derived from humans (in the case of a non-human cell). The 58224 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene which misexpresses an endogenous 58224, e.g., a gene the expression of which is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders which are related to mutated or mis-expressed 58224 alleles or for use in drug screening.

[1520] In another aspect, the invention features a human cell, e.g., a lymphoid cell, transformed with nucleic acid which encodes a subject 58224 polypeptide.

[1521] Also provided are cells, preferably human cells, e.g., human lymphoid or fibroblast cells, in which an endogenous 58224 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 58224 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 58224 gene. For example, an endogenous 58224 gene which is “transcriptionally silent,” e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques such as targeted homologous recombination can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published on May 16, 1991.

[1522] 58224 Transgenic Animals

[1523] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of a 58224 protein and for identifying and/or evaluating modulators of 58224 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 58224 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[1524] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of a 58224 protein to particular cells. A transgenic founder animal can be identified based upon the presence of a 58224 transgene in its genome and/or expression of 58224 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a 58224 protein can further be bred to other transgenic animals carrying other transgenes.

[1525] 58224 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments the nucleic acid is placed under the control of a tissue specific promoter, e.g., a milk or egg specific promoter, and the protein or polypeptide is recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[1526] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[1527] Uses of 58224

[1528] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: (a) screening assays; (b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); and (c) methods of treatment (e.g., therapeutic and prophylactic). The isolated nucleic acid molecules of the invention can be used, for example, to express a 58224 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect a 58224 mRNA (e.g., in a biological sample) or a genetic alteration in a 58224 gene, and to modulate 58224 activity, as described further below. The 58224 proteins can be used to treat disorders characterized by insufficient or excessive production of a 58224 substrate or production of 58224 inhibitors. In addition, the 58224 proteins can be used to screen for naturally occurring 58224 substrates, to screen for drugs or compounds that modulate 58224 activity, as well as to treat disorders characterized by insufficient or excessive production of 58224 protein or production of 58224 protein forms which have decreased, aberrant or unwanted activity compared to 58224 wild type protein (e.g., imbalance of helicase activity, leading to an increase or decrease in cell proliferation, differentiation, or neoplastic transformation). Moreover, the anti-58224 antibodies of the invention can be used to detect and isolate 58224 proteins, regulate the bioavailability of 58224 proteins, and modulate 58224 activity.

[1529] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 58224 polypeptide is provided. The method includes: contacting the compound with the subject 58224 polypeptide; and evaluating the ability of the compound to interact with, e.g., to bind, to form a complex with, or to enzymatically act upon, the subject 58224 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with a subject 58224 polypeptide. It can also be used to find natural or synthetic inhibitors of a subject 58224 polypeptide. Screening methods are discussed in more detail below.

[1530] 58224 Screening Assays

[1531] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) that bind to 58224 proteins, have a stimulatory or inhibitory effect on, for example, 58224 expression or 58224 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 58224 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 58224 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[1532] In one embodiment, the invention provides assays for screening candidate or test compounds that are substrates of a 58224 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate the activity of a 58224 protein or polypeptide or a biologically active portion thereof.

[1533] In any screening assay, a 58224 polypeptide that may have, e.g., a helicase domain, can be used.

[1534] Assays for helicase activity can include nucleic acid binding activity, e.g., RNA or DNA binding activity; NTPase activity, e.g., RNA or DNA dependent ATPase activity; nucleic acid unwinding activity, e.g., RNA or DNA unwinding activity. Such assays are described in, e.g., Luking et al. (1998) Crit. Rev. Biochem. and Mol. Biol. 33:259-296, and/or in the references cited therein.

[1535] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries [libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive] (see, e.g., Zuckermann, R. N. et al. J. Med. Chem. 1994, 37: 2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam, K. S. (1997) Anticancer Drug Des. 12:145).

[1536] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. U.S.A. 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and in Gallop et al. (1994) J. Med. Chem. 37:1233.

[1537] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria or spores (Ladner U.S. Pat. No. 5,223,409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390); (Devlin (1990) Science 249:404-406); (Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382); (Felici (1991) J. Mol. Biol. 222:301-310); (Ladner supra.).

[1538] In one embodiment, an assay is a cell-based assay in which a cell that expresses a 58224 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 58224 activity is determined. Determining the ability of the test compound to modulate 58224 activity can be accomplished by monitoring, for example, helicase activity, e.g., a helicase activity described herein. The cell, for example, can be of mammalian origin, e.g., human.

[1539] The ability of the test compound to modulate 58224 binding to a compound, e.g., a 58224 substrate, or to bind to 58224 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 58224 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 58224 can be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 58224 binding to a 58224 substrate in a complex. For example, compounds (e.g., 58224 substrates) can be labeled with ¹²⁵I, ³⁵S, ¹⁴C, or ³H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[1540] The ability of a compound (e.g., a 58224 substrate or modulator) to interact with 58224 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 58224 without the labeling of either the compound or 58224. McConnell, H. M. et al. (1992) Science 257:1906-1912. As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 58224.

[1541] In yet another embodiment, a cell-free assay is provided in which a 58224 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 58224 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 58224 proteins to be used in assays of the present invention include fragments that participate in interactions with non-58224 molecules, e.g., fragments with high surface probability scores.

[1542] Soluble and/or membrane-bound forms of isolated proteins (e.g., 58224 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, Isotridecypoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[1543] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[1544] Also provided are assays in which the ability of an agent to block helicase activity within a cell is evaluated.

[1545] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternately, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
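By way of illustration only, the distance dependence noted above is commonly modeled by the Förster relation, in which transfer efficiency falls off with the sixth power of the donor-acceptor separation. The following Python sketch assumes that relation; the function name, parameter names, and any Förster radius supplied to it are hypothetical and are not taken from the cited patents.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Energy-transfer efficiency under the Forster model (an assumption here).

    distance_nm       -- donor-acceptor separation
    forster_radius_nm -- separation at which efficiency is 50% for the label pair
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Forster radius must be > 0")
    ratio = distance_nm / forster_radius_nm
    # E = 1 / (1 + (r / R0)^6): close labels transfer efficiently, distant ones barely
    return 1.0 / (1.0 + ratio ** 6)
```

Under this model the efficiency is exactly 0.5 when the separation equals the Förster radius and drops steeply beyond it, which is why a bound complex (short separation) is expected to give a much stronger acceptor emission than unbound molecules.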

[1546] In another embodiment, determining the ability of the 58224 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal that can be used as an indication of real-time reactions between biological molecules.

[1547] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound (which is not anchored), can be labeled, either directly or indirectly, with detectable labels discussed herein.

[1548] It may be desirable to immobilize either 58224, an anti-58224 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a 58224 protein, or interaction of a 58224 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/58224 fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 58224 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and complex formation is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 58224 binding or activity determined using standard techniques.

[1549] Other techniques for immobilizing either a 58224 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 58224 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical).

[1550] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[1551] In one embodiment, this assay is performed utilizing antibodies reactive with 58224 protein or target molecules but which do not interfere with binding of the 58224 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 58224 protein is trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 58224 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 58224 protein or target molecule.

[1552] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., (1993) Trends Biochem Sci Aug; 18(8):284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York.); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., (1998) J Mol Recognit Winter; 11 (1-6):141-8; Hage, D. S., and Tweed, S. A. (1997) J. Chromatogr B. Biomed Sci Appl Oct 10;699(1-2):499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[1553] In a preferred embodiment, the assay includes contacting the 58224 protein or biologically active portion thereof with a known compound which binds 58224 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a 58224 protein, wherein determining the ability of the test compound to interact with a 58224 protein includes determining the ability of the test compound to preferentially bind to 58224 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[1554] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 58224 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of a 58224 protein through modulation of the activity of a downstream effector of a 58224 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[1555] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), e.g., a substrate, a reaction mixture containing the target gene product and the binding partner is prepared, under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.
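As a non-limiting illustration, the comparison between the control reaction and the reaction containing the test compound can be expressed as a percent inhibition of complex formation. The Python sketch below is one possible way to compute it; the function name, the optional background correction, and the signal units are assumptions made for illustration and are not specified in this description.

```python
def percent_inhibition(control_signal: float,
                       test_signal: float,
                       background: float = 0.0) -> float:
    """Percent reduction in complex signal caused by a test compound.

    control_signal -- complex signal without the test compound (or with placebo)
    test_signal    -- complex signal with the test compound present
    background     -- hypothetical blank reading subtracted from both signals
    """
    control = control_signal - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    treated = test_signal - background
    # 0% means no interference; 100% means no detectable complex formed
    return 100.0 * (1.0 - treated / control)
```

A compound that leaves the complex signal unchanged scores 0, whereas a compound that abolishes the signal scores 100. The same calculation can be repeated with a mutant target gene product to compare the two cases described above.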

[1556] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[1557] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[1558] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes that have formed remain immobilized on the solid surface. In assays where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. In assays where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[1559] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound. Reaction products are separated from unreacted components and complexes detected using, for example, an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[1560] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in which either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[1561] In yet another aspect, the 58224 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 58224 (“58224-binding proteins” or “58224-bp”) and are involved in 58224 activity. Such 58224-bps can be activators or inhibitors of signals by the 58224 proteins or 58224 targets as, for example, downstream elements of a 58224-mediated signaling pathway.

[1562] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a 58224 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence from a library of DNA sequences that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively the 58224 protein can be fused to the activator domain.) If the “bait” and the “prey” proteins are able to interact in vivo and form a 58224-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., LacZ) that is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene that encodes the protein that interacts with the 58224 protein.

[1563] In another embodiment, modulators of 58224 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 58224 mRNA or protein evaluated relative to the level of expression of 58224 mRNA or protein in the absence of the candidate compound. When expression of 58224 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 58224 mRNA or protein expression. Alternatively, when expression of 58224 mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 58224 mRNA or protein expression. The level of 58224 mRNA or protein expression can be determined by methods described herein for detecting 58224 mRNA or protein.
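By way of example only, the classification of a candidate compound as a stimulator or inhibitor of expression can be sketched as follows. The sketch assumes replicate expression measurements with and without the compound and uses an exact two-sided permutation test as one possible significance test; the choice of test, the 0.05 threshold, and all function names are assumptions made for illustration, and the same test is applied to both stimulators and inhibitors.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(treated, untreated):
    """Exact two-sided permutation p-value for a difference in mean expression."""
    pooled = list(treated) + list(untreated)
    observed = abs(mean(treated) - mean(untreated))
    indices = range(len(pooled))
    extreme = total = 0
    # Enumerate every way of relabeling len(treated) measurements as "treated"
    for chosen in combinations(indices, len(treated)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen_set]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total

def classify_modulator(treated, untreated, alpha=0.05):
    """Label a candidate compound from replicate mRNA or protein levels."""
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"
```

With four replicates per condition there are 70 possible relabelings, so the smallest attainable two-sided p-value is 2/70; fewer replicates cannot reach the 0.05 threshold assumed here.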

[1564] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a 58224 protein can be confirmed in vivo, e.g., in an animal model.

[1565] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a 58224 modulating agent, an antisense 58224 nucleic acid molecule, a 58224-specific antibody, or a 58224-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[1566] 58224 Detection Assays

[1567] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome e.g., to locate gene regions associated with genetic disease or to associate 58224 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[1568] 58224 Chromosome Mapping

[1569] The 58224 nucleotide sequences or portions thereof can be used to map the location of the 58224 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 58224 sequences with genes associated with disease.

[1570] Briefly, 58224 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 58224 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 58224 sequences will yield an amplified fragment.

[1571] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes and a full set of mouse chromosomes, allows easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220:919-924).
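For illustration, the assignment of a gene to a chromosome from such a panel can be sketched as a concordance check: the gene is assigned to any human chromosome whose presence across the hybrid cell lines matches the pattern of PCR-positive and PCR-negative results. The Python sketch below assumes a hypothetical panel layout and hypothetical cell line names; it is not drawn from the cited reference.

```python
def map_to_chromosome(panel, pcr_positive):
    """Return human chromosomes fully concordant with the PCR results.

    panel        -- dict: hybrid cell line name -> set of human chromosomes retained
    pcr_positive -- dict: hybrid cell line name -> True if an amplified fragment was seen
    """
    all_chromosomes = set().union(*panel.values())
    concordant = []
    for chrom in sorted(all_chromosomes):
        # A candidate chromosome must be present in every positive hybrid
        # and absent from every negative hybrid.
        if all((chrom in retained) == pcr_positive[name]
               for name, retained in panel.items()):
            concordant.append(chrom)
    return concordant

# Hypothetical three-line panel
panel = {"H1": {"1", "7"}, "H2": {"7", "X"}, "H3": {"1", "X"}}
results = {"H1": True, "H2": True, "H3": False}
assignment = map_to_chromosome(panel, results)
```

In this hypothetical panel only chromosome 7 is retained by both positive hybrids and missing from the negative one, so it is the sole concordant assignment. An empty result would suggest a panel that cannot resolve the location or an inconsistent PCR result.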

[1572] Other mapping strategies, e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries, can be used to map 58224 to a chromosomal location.

[1573] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases, will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques (Pergamon Press, New York 1988).

[1574] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[1575] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325:783-787.

[1576] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 58224 gene, can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

[1577] 58224 Tissue Typing

[1578] 58224 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., by electrophoresis and Southern blotted, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).

[1579] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 58224 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.

[1580] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences of SEQ ID NO:22 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers, which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO:24 are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[1581] If a panel of reagents from 58224 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[1582] Use of Partial 58224 Sequences in Forensic Biology

[1583] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen, found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[1584] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e., another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO:22 (e.g., fragments derived from the noncoding regions of SEQ ID NO:22 and having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[1585] The 58224 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue, e.g., a tissue containing 58224 helicase activity. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 58224 probes can be used to identify tissue by species and/or by organ type.

[1586] In a similar fashion, these reagents, e.g., 58224 primers or probes can be used to screen tissue culture for contamination (i.e., screen for the presence of a mixture of different types of cells in a culture).

[1587] Predictive Medicine of 58224

[1588] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[1589] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene that encodes 58224. Such disorders include, e.g., a disorder associated with the misexpression of 58224.

[1590] The method includes one or more of the following:

[1591] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 58224 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[1592] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 58224 gene;

[1593] detecting, in a tissue of the subject, the misexpression of the 58224 gene at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[1594] detecting, in a tissue of the subject, the misexpression of the gene at the protein level, e.g., detecting a non-wild type level of a 58224 polypeptide.

[1595] In preferred embodiments the method includes: ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 58224 gene; an insertion of one or more nucleotides into the gene, a point mutation, e.g., a substitution of one or more nucleotides of the gene, or a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[1596] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence that hybridizes to a sense or antisense sequence from SEQ ID NO:22, 24, or naturally occurring mutants thereof or 5′ or 3′ flanking sequences naturally associated with the 58224 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[1597] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 58224 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 58224.

[1598] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[1599] In preferred embodiments the method includes determining the structure of a 58224 gene, an abnormal structure being indicative of risk for the disorder.

[1600] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 58224 protein or a nucleic acid which hybridizes specifically with the gene. This and other embodiments are discussed below.

[1601] Diagnostic and Prognostic Assays of 58224

[1602] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 58224 molecules and for identifying variations and mutations in the sequence of 58224 molecules.

[1603] Expression Monitoring and Profiling:

[1604] The presence, level, or absence of 58224 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 58224 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 58224 protein such that the presence of 58224 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the 58224 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 58224 genes; measuring the amount of protein encoded by the 58224 genes; or measuring the activity of the protein encoded by the 58224 genes.

[1605] The level of mRNA corresponding to the 58224 gene in a cell can be determined both by in situ and by in vitro formats.

[1606] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 58224 nucleic acid, such as the nucleic acid of SEQ ID NO:22 or 24, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 58224 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[1607] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 58224 genes.

[1608] The level of mRNA in a sample that is encoded by 58224 can be evaluated with nucleic acid amplification, e.g., by rtPCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
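The primer geometry described above (a primer pair of about 10 to 30 nucleotides each, flanking a region of about 50 to 200 nucleotides) can be checked with a short Python sketch. The sequences below are arbitrary placeholders, not 58224 sequences, and the function names are assumptions made for illustration. The length limits are taken from the approximate ranges in the text and are treated here as hard cutoffs only for simplicity.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def check_primer_pair(template, forward, reverse):
    """Locate a primer pair on the plus strand of `template` and report
    whether primer lengths and the flanked region fall in the stated ranges.

    `forward` anneals to the minus strand (so it matches the plus strand);
    `reverse` is written 5'->3' and matches the minus strand.
    Returns None if either primer site is not found.
    """
    fwd_start = template.find(forward)
    if fwd_start < 0:
        return None
    fwd_end = fwd_start + len(forward)
    # The reverse primer site on the plus strand is its reverse complement.
    rev_start = template.find(reverse_complement(reverse), fwd_end)
    if rev_start < 0:
        return None
    flanked = rev_start - fwd_end  # nucleotides between the two primer sites
    return {
        "flanked_length": flanked,
        "primer_lengths_ok": all(10 <= len(p) <= 30 for p in (forward, reverse)),
        "flanked_length_ok": 50 <= flanked <= 200,
    }


# Placeholder sequences for illustration only.
forward = "ATGCGTACCGTTAGC"
reverse = reverse_complement("GGATCCTTAGCAAGT")
template = "TT" + forward + "ACGT" * 15 + "GGATCCTTAGCAAGT" + "AA"

result = check_primer_pair(template, forward, reverse)
print(result)
```

A check of this kind says nothing about melting temperature, specificity, or secondary structure, which a real primer design would also need to consider.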

[1609] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 58224 gene being analyzed.

[1610] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 58224 mRNA, or genomic DNA, and comparing the presence of 58224 mRNA or genomic DNA in the control sample with the presence of 58224 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 58224 transcript levels.

[1611] A variety of methods can be used to determine the level of protein encoded by 58224. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody with a sample, to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)₂) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[1612] The detection methods can be used to detect 58224 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 58224 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 58224 protein include introducing into a subject a labeled anti-58224 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-58224 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[1613] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 58224 protein, and comparing the presence of 58224 protein in the control sample with the presence of 58224 protein in the test sample.

[1614] The invention also includes kits for detecting the presence of 58224 in a biological sample. For example, the kit can include a compound or agent capable of detecting 58224 protein or mRNA in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 58224 protein or nucleic acid.

[1615] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[1616] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples which can be assayed and compared to the test sample contained. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[1617] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with misexpressed or aberrant or unwanted 58224 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as pain or deregulated cell proliferation.

[1618] In one embodiment, a disease or disorder associated with aberrant or unwanted 58224 expression or activity is identified. A test sample is obtained from a subject and 58224 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 58224 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 58224 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[1619] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 58224 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for a cell proliferation or differentiation disorder, e.g., cancer (e.g., breast, ovary, lung, or colon cancer, or liver metastasis), Werner syndrome, Bloom syndrome, Cockayne's syndrome, xeroderma pigmentosum, lymphoid proliferative diseases, α-Thalassemia X-linked mental retardation, or another cell proliferation or differentiation disorder as described herein.

[1620] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 58224 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 58224 (e.g., other genes associated with a 58224-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
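One way such data records could be laid out as a relational table is sketched below in Python. SQLite (via the standard-library `sqlite3` module) is used here only as a convenient stand-in for the database environments named above. The table name, column names, and all sample values are hypothetical and chosen for illustration.

```python
import sqlite3

# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")

# One row per (sample, gene): the expression value plus sample descriptors.
conn.execute(
    """
    CREATE TABLE expression_record (
        sample_id        TEXT NOT NULL,   -- identifier of the sample
        subject_id       TEXT,            -- subject the sample was derived from
        diagnosis        TEXT,            -- optional descriptor
        treatment        TEXT,            -- optional descriptor
        gene             TEXT NOT NULL,   -- e.g., '58224' or another gene on the array
        expression_level REAL NOT NULL,   -- value representing the level of expression
        PRIMARY KEY (sample_id, gene)
    )
    """
)

# Hypothetical records: 58224 plus one other gene measured in two samples.
rows = [
    ("S1", "P001", "unknown", None, "58224", 2.5),
    ("S1", "P001", "unknown", None, "gene_B", 0.8),
    ("S2", "P002", "unknown", None, "58224", 1.1),
    ("S2", "P002", "unknown", None, "gene_B", 0.9),
]
conn.executemany("INSERT INTO expression_record VALUES (?, ?, ?, ?, ?, ?)", rows)

# Retrieve the 58224 value for one sample.
level = conn.execute(
    "SELECT expression_level FROM expression_record WHERE sample_id = ? AND gene = ?",
    ("S1", "58224"),
).fetchone()[0]
print(level)
```

Storing one row per sample and gene keeps the table shape fixed as further genes are added; a wide table with one column per gene would be an equally valid reading of the text.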

[1621] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 58224 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used to diagnose a cell proliferation or differentiation disorder, e.g., Werner syndrome, Bloom syndrome, Cockayne's syndrome, xeroderma pigmentosum, lymphoid proliferative diseases, cancer or α-Thalassemia X-linked mental retardation, in a subject wherein altered 58224 expression is an indication that the subject has or is disposed to having a cell proliferation or differentiation disorder as described herein. The method can be used to monitor a treatment for a cell proliferation or differentiation disorder, e.g., cancer (e.g., breast, ovary, lung, or colon cancer, or liver metastasis), Werner syndrome, Bloom syndrome, Cockayne's syndrome, xeroderma pigmentosum, lymphoid proliferative diseases, α-Thalassemia X-linked mental retardation, or another cell proliferation or differentiation disorder as described herein. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[1622] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 58224 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from an uncontacted cell.
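The favorable-evaluation criterion in the preceding paragraph can be expressed as a short Python sketch. Euclidean distance is assumed here as the similarity measure, which the paragraph itself does not prescribe. The gene order, the profile values, and the function name `evaluated_favorably` are hypothetical.

```python
import math


def evaluated_favorably(contacted, uncontacted, target):
    """Return True if the profile of the contacted cell is closer to the
    target profile than the profile of the uncontacted cell is.

    Each profile is a list of expression values in the same gene order.
    Euclidean distance is an assumed similarity measure.
    """
    return math.dist(contacted, target) < math.dist(uncontacted, target)


# Hypothetical profiles; the first value stands for the level of 58224 expression.
target_profile = [1.0, 1.0, 1.0]       # e.g., a profile for a normal cell
uncontacted_profile = [4.0, 1.0, 2.0]  # cell not contacted with the test compound
contacted_profile = [2.0, 1.0, 1.5]    # cell contacted with the test compound

print(evaluated_favorably(contacted_profile, uncontacted_profile, target_profile))
```

`math.dist` requires Python 3.8 or later.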

[1623] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; and b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 58224 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profiles is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
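The metric described above, the length of the difference vector between two profiles, together with the selection of the most similar reference profile, can be sketched in Python as follows. The profile names, gene labels other than 58224, and all values are hypothetical, and the function names are assumptions made for illustration.

```python
import math


def profile_distance(profile_a, profile_b):
    """Length of the difference vector between two expression profiles.

    Each profile is a dict mapping gene name -> expression value; both
    profiles must cover the same genes so that the dimensions line up.
    """
    genes = sorted(profile_a)
    if sorted(profile_b) != genes:
        raise ValueError("profiles must contain the same genes")
    return math.sqrt(sum((profile_a[g] - profile_b[g]) ** 2 for g in genes))


def most_similar_reference(subject, references):
    """Return the name of the reference profile closest to the subject profile."""
    return min(references, key=lambda name: profile_distance(subject, references[name]))


# Hypothetical subject and reference profiles.
subject_profile = {"58224": 3.0, "gene_B": 1.0}
reference_profiles = {
    "reference_1": {"58224": 1.0, "gene_B": 1.0},
    "reference_2": {"58224": 3.0, "gene_B": 2.0},
}

print(profile_distance(subject_profile, reference_profiles["reference_1"]))
print(most_similar_reference(subject_profile, reference_profiles))
```

Because this metric is sensitive to the scale of each dimension, values would normally be normalized in a consistent way before profiles are compared; that step is omitted here.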

[1624] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[1625] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile, and the reference expression profiles each include a value representing the level of 58224 expression.

[1626] 58224 Arrays and Uses Thereof

[1627] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to a 58224 molecule (e.g., a 58224 nucleic acid or a 58224 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm², and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to the addresses of the plurality can be disposed on the array.

[1628] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to a 58224 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 58224. Each address of the subset can include a capture probe that hybridizes to a different region of a 58224 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for a 58224 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 58224 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 58224 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).

[1629] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[1630] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to a 58224 polypeptide or fragment thereof. The polypeptide can be a naturally-occurring interaction partner of 58224 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-58224 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[1631] In another aspect, the invention features a method of analyzing the expression of 58224. The method includes providing an array as described above; contacting the array with a sample and detecting binding of a 58224-molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally the method further includes amplifying nucleic acid from the sample prior or during contact with the array.

[1632] In another embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array, particularly the expression of 58224. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 58224. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
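As a simplified illustration of identifying genes co-regulated with 58224, the Python sketch below ranks genes by Pearson correlation of their expression with 58224 across samples. This is not one of the clustering methods named above; it is a plain correlation filter of the kind that clustering approaches often build on. The gene names other than 58224, the expression values, the threshold, and the function names are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant series carries no co-variation information
    return cov / (sd_x * sd_y)


def co_regulated(expression, target, threshold=0.9):
    """Return genes whose expression across samples correlates with `target`
    at or above `threshold` (an arbitrary illustrative cutoff)."""
    return sorted(
        gene
        for gene, values in expression.items()
        if gene != target and pearson(expression[target], values) >= threshold
    )


# Hypothetical expression levels measured in four tissue samples.
expression = {
    "58224": [1.0, 2.0, 3.0, 4.0],
    "gene_A": [2.0, 4.1, 5.9, 8.0],  # rises with 58224
    "gene_B": [4.0, 3.0, 2.0, 1.0],  # falls as 58224 rises
    "gene_C": [1.0, 1.0, 2.0, 1.0],  # weakly related
}

print(co_regulated(expression, "58224"))
```

With only a handful of samples, high correlations arise easily by chance, which is consistent with the text's requirement that a sufficient number of diverse samples be analyzed.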

[1633] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 58224 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[1634] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of other than the target gene can be ascertained and counteracted.

[1635] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of a 58224-associated disease or disorder; and processes, such as a cellular transformation associated with a 58224-associated disease or disorder. The method can also evaluate the treatment and/or progression of a 58224-associated disease or disorder.

[1636] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 58224) that could serve as a molecular target for diagnosis or therapeutic intervention.

[1637] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon a 58224 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to a 58224 polypeptide or fragment thereof. For example, multiple variants of a 58224 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the address of the plurality can be disposed on the array.

[1638] The polypeptide array can be used to detect a 58224 binding compound, e.g., an antibody in a sample from a subject with specificity for a 58224 polypeptide or the presence of a 58224-binding protein or ligand.

[1639] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 58224 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[1640] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 58224 or from a cell or subject in which a 58224-mediated response has been elicited, e.g., by contact of the cell with 58224 nucleic acid or protein, or administration to the cell or subject of 58224 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 58224 (or does not express as highly as in the case of the 58224 positive plurality of capture probes) or from a cell or subject in which a 58224-mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which are preferably other than a 58224 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[1641] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe; contacting the array with a first sample from a cell or subject which expresses or mis-expresses 58224 or from a cell or subject in which a 58224-mediated response has been elicited, e.g., by contact of the cell with 58224 nucleic acid or protein, or administration to the cell or subject of 58224 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe; contacting the array with a second sample from a cell or subject which does not express 58224 (or does not express as highly as in the case of the 58224 positive plurality of capture probes) or from a cell or subject in which a 58224-mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used, the plurality of addresses with capture probes should be present on both arrays.

[1642] In another aspect, the invention features a method of analyzing 58224, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 58224 nucleic acid or amino acid sequence; and comparing the 58224 sequence with one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 58224.
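As a rough illustration of comparing a query sequence with a collection of sequences, the sketch below ranks entries by percent identity. It assumes the sequences are already aligned and of equal length, which is a simplification; an actual analysis would typically use an alignment program. The function names and the toy sequences are hypothetical and are not 58224 sequences.

```python
def percent_identity(a, b):
    """Percent of positions at which two pre-aligned, equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not a:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

def rank_by_identity(query, database):
    """Return (name, percent identity) pairs, most similar first.

    Entries whose length differs from the query are skipped (sketch assumption).
    """
    scored = [(name, percent_identity(query, seq))
              for name, seq in database.items()
              if len(seq) == len(query)]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Toy collection standing in for a sequence database.
ranking = rank_by_identity("ACGT", {"entry_x": "ACGT", "entry_y": "AAAA"})
```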

[1643] Detection of 58224 Variations or Mutations

[1644] The methods of the invention can also be used to detect genetic alterations in a 58224 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by misregulation in 58224 protein activity or nucleic acid expression, such as a cell proliferation or differentiation disorder, e.g., cancer (e.g., breast, ovary, lung, or colon cancer, or liver metastasis), Werner syndrome, Bloom syndrome, Cockayne's syndrome, xeroderma pigmentosum, lymphoid proliferative diseases, α-Thalassemia X-linked mental retardation, or another cell proliferation or differentiation disorder as described herein. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a 58224 protein, or the mis-expression of the 58224 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of 1) a deletion of one or more nucleotides from a 58224 gene; 2) an addition of one or more nucleotides to a 58224 gene; 3) a substitution of one or more nucleotides of a 58224 gene; 4) a chromosomal rearrangement of a 58224 gene; 5) an alteration in the level of a messenger RNA transcript of a 58224 gene; 6) aberrant modification of a 58224 gene, such as of the methylation pattern of the genomic DNA; 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of a 58224 gene; 8) a non-wild type level of a 58224 protein; 9) allelic loss of a 58224 gene; and 10) inappropriate post-translational modification of a 58224 protein.

[1645] An alteration can be detected with a probe/primer in a polymerase chain reaction, such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 58224 gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a 58224 gene under conditions such that hybridization and amplification of the 58224 gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[1646] In another embodiment, mutations in a 58224 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA are isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment lengths are determined, e.g., by gel electrophoresis, and compared. Differences in fragment lengths between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
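The fragment-length comparison can be sketched in a few lines. The example assumes a linear, single-stranded representation of the DNA and a single enzyme, EcoRI, which recognizes GAATTC and cuts after the first G; the function name and both toy sequences are hypothetical and are not 58224 sequences.

```python
def digest_fragment_lengths(sequence, site, cut_offset):
    """Sorted fragment lengths after cutting linear DNA at every occurrence of `site`.

    `cut_offset` is the number of bases into the site at which the cut falls.
    """
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(sequence)]
    return sorted(right - left
                  for left, right in zip(boundaries, boundaries[1:])
                  if right > left)

control_dna = "AAAAGAATTCAAAAAA"   # contains one EcoRI site
sample_dna = "AAAAGACTTCAAAAAA"    # A-to-C substitution destroys the site

control_pattern = digest_fragment_lengths(control_dna, "GAATTC", 1)
sample_pattern = digest_fragment_lengths(sample_dna, "GAATTC", 1)
patterns_differ = control_pattern != sample_pattern
```

Here the control yields two fragments and the sample yields one uncut fragment, so the differing patterns point to a sequence change at the recognition site.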

[1647] In other embodiments, genetic mutations in 58224 can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the others. A different probe is located at each address of the plurality. A probe can be complementary to a region of a 58224 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of a 58224 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 58224 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
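The first scanning step, in which sequential overlapping probes locate a base change, can be illustrated as follows. The sketch assumes a single substitution, sample and reference of equal length, and perfect-match-only hybridization modeled as string equality on one strand; all names and sequences are hypothetical.

```python
def tiled_probes(reference, probe_length, step=1):
    """Sequential overlapping probes as (start, probe sequence) pairs."""
    last_start = len(reference) - probe_length
    return [(start, reference[start:start + probe_length])
            for start in range(0, last_start + 1, step)]

def mismatched_probe_starts(sample, reference, probe_length):
    """Start positions of probes that do not perfectly match the sample."""
    return [start
            for start, probe in tiled_probes(reference, probe_length)
            if sample[start:start + probe_length] != probe]

def candidate_positions(mismatched_starts, probe_length):
    """Positions covered by every mismatching probe (single-substitution assumption)."""
    if not mismatched_starts:
        return []
    first = max(mismatched_starts)
    last = min(mismatched_starts) + probe_length - 1
    return list(range(first, last + 1))

reference = "ACGTACGT"
sample = "ACGAACGT"  # substitution at index 3
failed = mismatched_probe_starts(sample, reference, probe_length=3)
suspects = candidate_positions(failed, probe_length=3)
```

A second, smaller probe set specific to the suspected position would then be used to characterize the change, as the paragraph above describes.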

[1648] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 58224 gene and detect mutations by comparing the sequence of the sample 58224 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.

[1649] Other methods for detecting mutations in the 58224 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl Acad Sci USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295).

[1650] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 58224 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[1651] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 58224 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc. Natl. Acad. Sci. USA 86:2766; see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control 58224 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence; the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).

[1652] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[1653] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad. Sci USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[1654] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.

[1655] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 50%, 60%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to a 58224 nucleic acid.

[1656] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO:22 or 24, or the complement of SEQ ID NO:22 or 24. Different locations can be different but overlapping or nonoverlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[1657] The set can be useful, e.g., for identifying SNPs, or identifying specific alleles of 58224. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a biallelic or polymorphic locus.

[1658] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the Tₘ of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
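A set of four oligonucleotides differing only at the interrogation position can be sketched as below. The example assumes perfect-match hybridization modeled as exact reverse-complement equality, which ignores melting temperature and partial binding; the function names and the five-base target region are hypothetical and are not taken from SEQ ID NO:22 or 24.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def interrogation_probes(target_region, position):
    """Four probes, one per base at `position`, each complementary to a target variant.

    Keys are the target-strand base that the probe detects.
    """
    probes = {}
    for base in "ACGT":
        variant = target_region[:position] + base + target_region[position + 1:]
        probes[base] = reverse_complement(variant)
    return probes

def call_allele(sample_region, probes):
    """Base at the interrogation position whose probe perfectly matches the sample."""
    for base, probe in probes.items():
        if reverse_complement(probe) == sample_region:
            return base
    return None  # no probe matches, e.g., a second change elsewhere in the region

probes = interrogation_probes("ACGTA", position=2)
called = call_allele("ACTTA", probes)
```

In an array format, each of the four probes would occupy its own address, and the address giving the strongest signal would indicate the base present in the sample.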

[1659] In a preferred embodiment the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, a 58224 nucleic acid.

[1660] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a 58224 gene.

[1661] Use of 58224 Molecules as Surrogate Markers

[1662] The 58224 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 58224 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 58224 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker which correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[1663] The 58224 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker which correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., a 58224 marker) transcription or expression, the amplified marker may be in a quantity which is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-58224 antibodies may be employed in an immune-based detection system for a 58224 protein marker, or 58224-specific radiolabeled probes may be used to detect a 58224 mRNA marker. Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. 
Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[1664] The 58224 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker which correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35:1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA or protein (e.g., 58224 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 58224 DNA may correlate with 58224 drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[1665] Pharmaceutical Compositions of 58224

[1666] The nucleic acid and polypeptides, fragments thereof, as well as anti-58224 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[1667] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous; oral (e.g., inhalation); transdermal (topical); transmucosal; and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates; and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[1668] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol or sorbitol, or sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

[1669] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle that contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[1670] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[1671] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser that contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[1672] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[1673] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[1674] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[1675] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[1676] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD₅₀ (the dose lethal to 50% of the population) and the ED₅₀ (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD₅₀/ED₅₀. Compounds that exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
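The therapeutic index defined above is a simple ratio, shown here as a minimal sketch. The function name and the numeric values are hypothetical and do not describe any compound of the invention; both doses are assumed to be expressed in the same units.

```python
def therapeutic_index(ld50, ed50):
    """Ratio LD50/ED50; both doses must be positive and in the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50

# Hypothetical compounds: the one with the higher index would be preferred.
index_a = therapeutic_index(ld50=200.0, ed50=10.0)
index_b = therapeutic_index(ld50=50.0, ed50=10.0)
preferred = "A" if index_a > index_b else "B"
```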

[1677] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED₅₀ with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC₅₀ (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.

[1678] As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
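The weight-based dosing and weekly schedule described above reduce to simple arithmetic. The following is a minimal sketch under stated assumptions: the function name `weekly_schedule` is hypothetical, and the hard limits simply mirror the broadest ranges recited above (about 0.001 to 30 mg/kg, once per week for about 1 to 10 weeks), which the text qualifies with "about" and which a clinician may adjust for the factors noted.

```python
def weekly_schedule(dose_mg_per_kg, body_weight_kg, weeks):
    """Return a list of (week_number, dose_in_mg) for once-weekly dosing."""
    # Assumed hard limits taken from the broadest recited ranges.
    if not 0.001 <= dose_mg_per_kg <= 30:
        raise ValueError("dose outside the 0.001 to 30 mg/kg range")
    if not 1 <= weeks <= 10:
        raise ValueError("duration outside the 1 to 10 week range")
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    per_dose_mg = dose_mg_per_kg * body_weight_kg
    return [(week, per_dose_mg) for week in range(1, weeks + 1)]


# Hypothetical subject: 70 kg, 5 mg/kg once per week for 4 weeks.
schedule = weekly_schedule(5.0, 70.0, 4)
```

For the assumed 70 kg subject, each weekly administration would be 350 mg.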

[1679] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J. Acquired Immune Deficiency Syndromes and Human Retrovirology 14:193).

[1680] The present invention encompasses agents that modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[1681] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 μg/kg to about 500 mg/kg, about 100 μg/kg to about 5 mg/kg, or about 1 μg/kg to about 50 μg/kg). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

[1682] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive metal ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, and puromycin and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP; cisplatin)), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine and vinblastine).

[1683] The conjugates of the invention can be used for modifying a given biological response. The drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[1684] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[1685] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[1686] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[1687] Methods of Treatment for 58224

[1688] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 58224 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[1689] It is possible that some 58224 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms. Relevant disorders can include cell proliferation or differentiation disorders, e.g., cancer (e.g., breast, ovary, lung, or colon cancer, or liver metastasis), Werner syndrome, Bloom syndrome, Cockayne's syndrome, xeroderma pigmentosum, lymphoid proliferative diseases, α-Thalassemia X-linked mental retardation, or another cell proliferation or differentiation disorder as described herein.

[1690] As the 58224 molecules are expressed in breast, ovary, colon, and lung tumors, in liver metastases, and in fetal liver, these molecules can be used diagnostically and therapeutically to treat/diagnose proliferative diseases of the breast, ovary, colon, lung, and liver, as described herein above.

[1691] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics as described below.

[1692] The 58224 molecules can act as novel diagnostic targets and therapeutic agents for controlling cellular proliferative and/or differentiative disorders, which have been described above, as well as disorders associated with bone metabolism, brain disorders, the immune system, the cardiovascular system, the liver, viral diseases, and pain.

[1693] Disorders involving the brain include, but are not limited to, disorders involving neurons, and disorders involving glia, such as astrocytes, oligodendrocytes, ependymal cells, and microglia; cerebral edema, raised intracranial pressure and herniation, and hydrocephalus; malformations and developmental diseases, such as neural tube defects, forebrain anomalies, posterior fossa anomalies, and syringomyelia and hydromyelia; perinatal brain injury; cerebrovascular diseases, such as those related to hypoxia, ischemia, and infarction, including hypotension, hypoperfusion, and low-flow states—global cerebral ischemia and focal cerebral ischemia—infarction from obstruction of local blood supply, intracranial hemorrhage, including intracerebral (intraparenchymal) hemorrhage, subarachnoid hemorrhage and ruptured berry aneurysms, and vascular malformations, hypertensive cerebrovascular disease, including lacunar infarcts, slit hemorrhages, and hypertensive encephalopathy; infections, such as acute meningitis, including acute pyogenic (bacterial) meningitis and acute aseptic (viral) meningitis, acute focal suppurative infections, including brain abscess, subdural empyema, and extradural abscess, chronic bacterial meningoencephalitis, including tuberculosis and mycobacterioses, neurosyphilis, and neuroborreliosis (Lyme disease), viral meningoencephalitis, including arthropod-borne (Arbo) viral encephalitis, Herpes simplex virus Type 1, Herpes simplex virus Type 2, Varicella-zoster virus (Herpes zoster), cytomegalovirus, poliomyelitis, rabies, and human immunodeficiency virus 1, including HIV-1 meningoencephalitis (subacute encephalitis), vacuolar myelopathy, AIDS-associated myopathy, peripheral neuropathy, and AIDS in children, progressive multifocal leukoencephalopathy, subacute sclerosing panencephalitis, fungal meningoencephalitis, other infectious diseases of the nervous system; transmissible spongiform encephalopathies (prion diseases); demyelinating diseases, including multiple sclerosis, multiple sclerosis variants, acute disseminated encephalomyelitis and acute necrotizing hemorrhagic encephalomyelitis, and other diseases with demyelination; degenerative diseases, such as degenerative diseases affecting the cerebral cortex, including Alzheimer disease and Pick disease, degenerative diseases of basal ganglia and brain stem, including Parkinsonism, idiopathic Parkinson disease (paralysis agitans), progressive supranuclear palsy, corticobasal degeneration, multiple system atrophy, including striatonigral degeneration, Shy-Drager syndrome, and olivopontocerebellar atrophy, and Huntington disease; spinocerebellar degenerations, including spinocerebellar ataxias, including Friedreich ataxia, and ataxia-telangiectasia, degenerative diseases affecting motor neurons, including amyotrophic lateral sclerosis (motor neuron disease), bulbospinal atrophy (Kennedy syndrome), and spinal muscular atrophy; inborn errors of metabolism, such as leukodystrophies, including Krabbe disease, metachromatic leukodystrophy, adrenoleukodystrophy, Pelizaeus-Merzbacher disease, and Canavan disease, mitochondrial encephalomyopathies, including Leigh disease and other mitochondrial encephalomyopathies; toxic and acquired metabolic diseases, including vitamin deficiencies such as thiamine (vitamin B₁) deficiency and vitamin B₁₂ deficiency, neurologic sequelae of metabolic disturbances, including hypoglycemia, hyperglycemia, and hepatic encephalopathy, toxic disorders, including carbon monoxide, methanol, ethanol, and radiation, including combined methotrexate and radiation-induced injury; tumors, such as gliomas, including astrocytoma, including fibrillary (diffuse) astrocytoma and glioblastoma multiforme, pilocytic astrocytoma, pleomorphic xanthoastrocytoma, and brain stem glioma, oligodendroglioma, and ependymoma and related paraventricular mass lesions, neuronal tumors, poorly differentiated neoplasms, including medulloblastoma, other parenchymal tumors, including primary brain lymphoma, germ cell tumors, and pineal parenchymal tumors, meningiomas, metastatic tumors, paraneoplastic syndromes, peripheral nerve sheath tumors, including schwannoma, neurofibroma, and malignant peripheral nerve sheath tumor (malignant schwannoma), and neurocutaneous syndromes (phakomatoses), including neurofibromatosis, including Type 1 neurofibromatosis (NF1) and Type 2 neurofibromatosis (NF2), tuberous sclerosis, and Von Hippel-Lindau disease.

[1694] Aberrant expression and/or activity of 58224 molecules may mediate disorders associated with bone metabolism. “Bone metabolism” refers to direct or indirect effects in the formation or degeneration of bone structures, e.g., bone formation, bone resorption, etc., which may ultimately affect the concentrations in serum of calcium and phosphate. This term also includes effects mediated by 58224 molecules in bone cells, e.g., osteoclasts and osteoblasts, that may in turn result in bone formation and degeneration. For example, 58224 molecules may support different activities of bone resorbing osteoclasts such as the stimulation of differentiation of monocytes and mononuclear phagocytes into osteoclasts. Accordingly, 58224 molecules that modulate the production of bone cells can influence bone formation and degeneration, and thus may be used to treat bone disorders. Examples of such disorders include, but are not limited to, osteoporosis, osteodystrophy, osteomalacia, rickets, osteitis fibrosa cystica, renal osteodystrophy, osteosclerosis, anti-convulsant treatment, osteopenia, fibrogenesis imperfecta ossium, secondary hyperparathyroidism, hypoparathyroidism, hyperparathyroidism, cirrhosis, obstructive jaundice, drug induced metabolism, medullary carcinoma, chronic renal disease, rickets, sarcoidosis, glucocorticoid antagonism, malabsorption syndrome, steatorrhea, tropical sprue, idiopathic hypercalcemia and milk fever.

[1695] The 58224 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of immune disorders. Examples of immune disorders or diseases include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy, such as atopic allergy.

[1696] Examples of disorders involving the heart or “cardiovascular disorder” include, but are not limited to, a disease, disorder, or state involving the cardiovascular system, e.g., the heart, the blood vessels, and/or the blood. A cardiovascular disorder can be caused by an imbalance in arterial pressure, a malfunction of the heart, or an occlusion of a blood vessel, e.g., by a thrombus. Examples of such disorders include hypertension, atherosclerosis, coronary artery spasm, congestive heart failure, coronary artery disease, valvular disease, arrhythmias, and cardiomyopathies.

[1697] Disorders which may be treated or diagnosed by methods described herein include, but are not limited to, disorders associated with an accumulation in the liver of fibrous tissue, such as that resulting from an imbalance between production and degradation of the extracellular matrix accompanied by the collapse and condensation of preexisting fibers. The methods described herein can be used to diagnose or treat hepatocellular necrosis or injury induced by a wide variety of agents including processes which disturb homeostasis, such as an inflammatory process, tissue damage resulting from toxic injury or altered hepatic blood flow, and infections (e.g., bacterial, viral and parasitic). For example, the methods can be used for the early detection of hepatic injury, such as portal hypertension or hepatic fibrosis. In addition, the methods can be employed to detect liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, A1-antitrypsin deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease), disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome). Additionally, the methods described herein may be useful for the early detection and treatment of liver injury associated with the administration of various chemicals or drugs, such as, for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol, or which represents a hepatic manifestation of a vascular disorder such as obstruction of either the intrahepatic or extrahepatic bile flow or an alteration in hepatic circulation resulting, for example, from chronic heart failure, veno-occlusive disease, portal vein thrombosis or Budd-Chiari syndrome.

[1698] Additionally, 58224 molecules may play an important role in the etiology of certain viral diseases, including but not limited to Hepatitis B, Hepatitis C and Herpes Simplex Virus (HSV). Modulators of 58224 activity could be used to control viral diseases. The modulators can be used in the treatment and/or diagnosis of viral infected tissue or virus-associated tissue fibrosis, especially liver and liver fibrosis. Also, 58224 modulators can be used in the treatment and/or diagnosis of virus-associated carcinoma, especially hepatocellular cancer.

[1699] Additionally, 58224 may play an important role in the regulation of pain disorders. Examples of pain disorders include, but are not limited to, pain response elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia, usually referred to as hyperalgesia (described in, for example, Fields, H. L. (1987) Pain, New York: McGraw-Hill); pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches; pain associated with surgery; pain related to irritable bowel syndrome; or chest pain.

[1700] In one aspect, the invention provides a method for preventing in a subject, a disease or condition associated with an aberrant or unwanted 58224 expression or activity, by administering to the subject 58224 or an agent that modulates 58224 expression or at least one 58224 activity. Subjects at risk for a disease that is caused or contributed to by aberrant or unwanted 58224 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 58224 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 58224 aberrance, for example, a 58224 agonist or 58224 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[1701] It is possible that some 58224 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[1702] As discussed above, successful treatment of 58224 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using assays described above, that exhibit negative modulatory activities can be used in accordance with the invention to prevent and/or ameliorate symptoms of 58224 disorders. Such molecules can include, but are not limited to, peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)₂ and Fab expression library fragments, scFv molecules, and epitope-binding fragments thereof).

[1703] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[1704] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[1705] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 58224 expression is through the use of aptamer molecules specific for 58224 protein. Aptamers are nucleic acid molecules having a tertiary structure that permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. 1997 Curr. Opin. Chem Biol. 1(1): 5-9; and Patel, D. J. 1997 Curr Opin Chem Biol Jun; 1(1):32-46). Since nucleic acid molecules may, in many cases, be more conveniently introduced into target cells than therapeutic protein molecules, aptamers offer a method by which 58224 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[1706] Antibodies can be generated that are both specific for target gene products and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 58224 disorders. For a description of antibodies, see the Antibody section above.

[1707] In circumstances wherein injection of an animal or a human subject with a 58224 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 58224 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. 1999 Ann Med 31(1):66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. 1998 Cancer Treat Res 94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 58224 protein. Vaccines directed to a disease characterized by 58224 expression may also be generated in this fashion.

[1708] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see e.g., Marasco et al. (1993) Proc. Natl. Acad. Sci. USA 90:7889-7893).

[1709] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 58224 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders.

[1710] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD₅₀ and the ED₅₀ as described above in the Pharmaceutical Composition section.

[1711] Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. A compound that is able to modulate 58224 activity is used as a template or “imprinting molecule,” to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix that contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al (1996) Current Opinion in Biotechnology 7:89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2:166-173. Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al (1993) Nature 361:645-647. Through the use of isotope-labeling, the “free” concentration of compound which modulates the expression or activity of 58224 can be readily monitored and used in calculations of IC₅₀.

[1712] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC₅₀. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al (1995) Analytical Chemistry 67:2142-2144.

[1713] Another aspect of the invention pertains to methods of modulating 58224 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with a 58224 or an agent that modulates one or more of the activities of 58224 protein associated with the cell. An agent that modulates 58224 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of a 58224 protein (e.g., a 58224 substrate or receptor), a 58224 antibody, a 58224 agonist or antagonist, a peptidomimetic of a 58224 agonist or antagonist, or other small molecule.

[1714] In one embodiment, the agent stimulates one or more 58224 activities. Examples of such stimulatory agents include active 58224 protein and a nucleic acid molecule encoding 58224. In another embodiment, the agent inhibits one or more 58224 activities. Examples of such inhibitory agents include antisense 58224 nucleic acid molecules, anti-58224 antibodies, and 58224 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of a 58224 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., up-regulates or down-regulates) 58224 expression or activity. In another embodiment, the method involves administering a 58224 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 58224 expression or activity.

[1715] Stimulation of 58224 activity is desirable in situations in which 58224 is abnormally down-regulated and/or in which increased 58224 activity is likely to have a beneficial effect. For example, stimulation of 58224 activity is desirable in situations in which a 58224 is down-regulated and/or in which increased 58224 activity is likely to have a beneficial effect. Likewise, inhibition of 58224 activity is desirable in situations in which 58224 is abnormally up-regulated and/or in which decreased 58224 activity is likely to have a beneficial effect.

[1716] 58224 Pharmacogenomics

[1717] The 58224 molecules of the present invention, as well as agents or modulators which have a stimulatory or inhibitory effect on 58224 activity (e.g., 58224 gene expression) as identified by a screening assay described herein, can be administered to individuals to treat (prophylactically or therapeutically) 58224-associated disorders associated with aberrant or unwanted 58224 activity (e.g., hyperproliferative disorders, e.g., cancer). In conjunction with such treatment, pharmacogenomics may be considered. “Pharmacogenomics,” as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype” or “drug response genotype”). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 58224 molecules of the present invention or 58224 modulators according to that individual's drug response genotype.

[1718] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23(10-11):983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43(2):254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as single factors altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally occurring polymorphisms.

[1719] Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a 58224 molecule or 58224 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a 58224 molecule or 58224 modulator.

[1720] One pharmacogenomics approach to identifying genes that predict drug response, known as “a genome-wide association,” relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high-resolution map can be generated from a combination of some ten million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
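By way of non-limiting illustration, the grouping of individuals into genetic categories by SNP pattern can be sketched as follows. This is a minimal sketch only: the function name `group_by_snp_pattern`, the marker names, and the genotype calls are hypothetical and are not data from the present disclosure.

```python
from collections import defaultdict


def group_by_snp_pattern(genotypes, markers):
    """Group individuals that share the same allele at every listed marker.

    genotypes: dict mapping an individual's label to a dict of
               marker name -> allele call (all values hypothetical).
    markers:   ordered list of marker names that defines the pattern.
    """
    groups = defaultdict(list)
    for individual, calls in genotypes.items():
        # The tuple of calls at the chosen markers is the "pattern of SNPs".
        pattern = tuple(calls[marker] for marker in markers)
        groups[pattern].append(individual)
    return dict(groups)


# Hypothetical toy input: three individuals typed at two markers.
example_genotypes = {
    "p1": {"snp_a": "A", "snp_b": "G"},
    "p2": {"snp_a": "A", "snp_b": "G"},
    "p3": {"snp_a": "T", "snp_b": "G"},
}
example_groups = group_by_snp_pattern(example_genotypes, ["snp_a", "snp_b"])
```

Individuals falling under the same key share a pattern and could, under the approach described above, be considered together when tailoring a treatment regimen.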

[1721] Alternatively, a method termed the “candidate gene approach,” can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., a 58224 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.

[1722] Alternatively, a method termed “gene expression profiling,” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., a 58224 molecule or 58224 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[1723] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 58224 molecule or 58224 modulator, such as a modulator identified by one of the exemplary screening assays described herein.

[1724] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 58224 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 58224 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells, e.g., cancer cells, will become sensitive to treatment with an agent that the unmodified target cells were resistant to.

[1725] Monitoring the influence of agents (e.g., drugs) on the expression or activity of a 58224 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 58224 gene expression, protein levels, or up-regulate 58224 activity, can be monitored in clinical trials of subjects exhibiting decreased 58224 gene expression, protein levels, or down-regulated 58224 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 58224 gene expression, protein levels, or down-regulate 58224 activity, can be monitored in clinical trials of subjects exhibiting increased 58224 gene expression, protein levels, or up-regulated 58224 activity. In such clinical trials, the expression or activity of a 58224 gene, and preferably, other genes that have been implicated in, for example, a 58224-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[1726] 58224 Informatics

[1727] The sequence of a 58224 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a 58224 sequence. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 58224 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequence, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[1728] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[1729] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[1730] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples for annotation to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites, and splice junctions. Non-limiting examples for annotations to amino acid sequence include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
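The two-table arrangement described above can be sketched, by way of non-limiting illustration, using Python's built-in `sqlite3` module in place of Sybase or Oracle. The table names, column names, the identifier "demo-1", and the placeholder residues are all assumptions made for illustration; the placeholder is not a sequence of the invention.

```python
import sqlite3


def build_sequence_db():
    """Create an in-memory database with a sequence table and an annotation table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sequences (
            seq_id   TEXT PRIMARY KEY,  -- identifier field
            residues TEXT NOT NULL      -- sequence field
        );
        CREATE TABLE annotations (
            seq_id     TEXT NOT NULL REFERENCES sequences(seq_id),
            descriptor TEXT NOT NULL,     -- annotation text
            start_pos  INTEGER NOT NULL,  -- initial position (1-based, inclusive)
            end_pos    INTEGER NOT NULL   -- ultimate position (1-based, inclusive)
        );
        """
    )
    return conn


def annotated_region(conn, seq_id, descriptor):
    """Return the residues covered by one annotation, or None if it is absent."""
    row = conn.execute(
        """
        SELECT substr(s.residues, a.start_pos, a.end_pos - a.start_pos + 1)
        FROM annotations AS a
        JOIN sequences AS s ON s.seq_id = a.seq_id
        WHERE a.seq_id = ? AND a.descriptor = ?
        """,
        (seq_id, descriptor),
    ).fetchone()
    return row[0] if row else None


# Hypothetical placeholder record and annotation, for illustration only.
conn = build_sequence_db()
conn.execute("INSERT INTO sequences VALUES (?, ?)", ("demo-1", "ATGGCCAAGTTTGGCTAA"))
conn.execute(
    "INSERT INTO annotations VALUES (?, ?, ?, ?)", ("demo-1", "start codon", 1, 3)
)
conn.commit()
```

Keeping annotations in a separate table keyed by the sequence identifier allows any number of annotations per sequence without altering the sequence table.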

[1731] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.
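As a non-limiting sketch of such a search, the following locates every exact occurrence of a target sequence or motif within a stored sequence. Exact string matching is used here as a simple stand-in and is not the BLAST algorithm; the function name `find_matches` and the example strings are hypothetical.

```python
def find_matches(stored, target):
    """Return the 0-based start positions of every exact occurrence of
    `target` in `stored`, including overlapping occurrences."""
    if not target:
        raise ValueError("target must be a non-empty sequence")
    positions = []
    start = stored.find(target)
    while start != -1:
        positions.append(start)
        # Resume one residue further along so overlapping hits are reported.
        start = stored.find(target, start + 1)
    return positions


# Hypothetical stored sequence and target motif.
example_hits = find_matches("ATGATGATG", "ATGATG")
```

An empty result indicates that the stored sequence contains no region matching the target exactly; a tolerant comparison would require an alignment-based tool such as those named elsewhere herein.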

[1732] Thus, in one aspect, the invention features a method of analyzing 58224, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing a 58224 nucleic acid or amino acid sequence; and comparing the 58224 sequence with a second sequence, e.g., one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 58224. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[1733] The method can include evaluating the sequence identity between a 58224 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.
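By way of non-limiting illustration, sequence identity between two sequences that have already been aligned can be evaluated as sketched below. The sketch assumes equal-length aligned strings with "-" marking gaps, and assumes identity is expressed as identical non-gap columns divided by alignment length; other conventions for the denominator exist. The function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over two pre-aligned, equal-length sequences.

    Columns count as identical only when both residues match and are
    not gap characters. The denominator is the full alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must be non-empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical four-column alignment with one mismatch.
example_identity = percent_identity("ACGT", "ACGA")
```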

[1734] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
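The relationship between target length and chance occurrence can be illustrated with a simple estimate. The sketch below assumes, purely for illustration, that database residues are independent and uniformly distributed over the alphabet (4 nucleotides or 20 amino acids), which real sequence databases do not satisfy; the function name is hypothetical.

```python
def expected_random_hits(target_len, database_len, alphabet_size):
    """Expected number of chance exact matches of a target of length
    `target_len` in a random database of `database_len` residues."""
    if target_len > database_len:
        return 0.0
    # Number of positions at which a match could begin.
    windows = database_len - target_len + 1
    # Probability that one window matches the target exactly.
    p_match = (1.0 / alphabet_size) ** target_len
    return windows * p_match


# A six-nucleotide target in 4101 random nucleotides: 4096 windows,
# each matching with probability 1/4096.
six_mer_hits = expected_random_hits(6, 4101, 4)
```

Under this assumption the expected count falls by a factor of four for each added nucleotide, which is why longer targets are less likely to occur at random.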

[1735] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[1736] Thus, the invention features a method of making a computer readable record of a 58224 sequence which includes recording the sequence on a computer readable matrix. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.
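A record of this kind, together with one of the listed items (identification of an ORF), can be sketched as follows. This is a simplified illustration only: it scans the forward strand for the first ATG that is followed by an in-frame stop codon, and the class name, field names, and example sequence are hypothetical rather than drawn from the sequences of the invention.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(dna):
    """Return (start, end) of the first forward-strand ORF, as a 0-based
    start and exclusive end that includes the stop codon, or None."""
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Walk codon by codon, in frame with this ATG, looking for a stop.
        for i in range(start + 3, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return (start, i + 3)
        start = dna.find("ATG", start + 1)
    return None


@dataclass
class SequenceRecord:
    identifier: str
    residues: str
    orf: Optional[Tuple[int, int]] = None  # filled in by annotate()

    def annotate(self):
        """Record the ORF coordinates on this record and return it."""
        self.orf = first_orf(self.residues)
        return self


# Hypothetical placeholder sequence.
example_record = SequenceRecord("demo-1", "CCATGAAATAGGG").annotate()
```

Further fields (e.g., domains or the transcription start) could be added to the record in the same manner.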

[1737] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing a 58224 sequence, or record, in machine-readable form; comparing a second sequence to the 58224 sequence; thereby analyzing a sequence. Comparison can include comparing to sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 58224 sequence includes a sequence being compared. In a preferred embodiment the 58224 or second sequence is stored on a first computer, e.g., at a first site and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 58224 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[1738] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has a 58224-associated disease or disorder or a pre-disposition to a 58224-associated disease or disorder, wherein the method comprises the steps of determining 58224 sequence information associated with the subject and based on the 58224 sequence information, determining whether the subject has a 58224-associated disease or disorder or a pre-disposition to a 58224-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[1739] The invention further provides in an electronic system and/or in a network, a method for determining whether a subject has a 58224-associated disease or disorder or a pre-disposition to a disease associated with a 58224 wherein the method comprises the steps of determining 58224 sequence information associated with the subject, and based on the 58224 sequence information, determining whether the subject has a 58224-associated disease or disorder or a pre-disposition to a 58224-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, comparing the 58224 sequence of the subject to the 58224 sequences in the database to thereby determine whether the subject has a 58224-associated disease or disorder, or a pre-disposition for such.

[1740] The present invention also provides in a network, a method for determining whether a subject has a 58224-associated disease or disorder or a pre-disposition to a 58224-associated disease or disorder, said method comprising the steps of receiving 58224 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 58224 and/or corresponding to a 58224-associated disease or disorder (e.g., a cell proliferation or differentiation disorder, e.g., cancer, e.g., breast, ovary, lung, or colon cancer, or liver metastasis; Werner syndrome; Bloom syndrome; Cockayne's syndrome; xeroderma pigmentosum; lymphoid proliferative diseases; α-Thalassemia X-linked mental retardation; or another cell proliferation or differentiation disorder as described herein), and based on one or more of the phenotypic information, the 58224 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has a 58224-associated disease or disorder or a pre-disposition to a 58224-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[1741] The present invention also provides a method for determining whether a subject has a 58224-associated disease or disorder or a pre-disposition to a 58224-associated disease or disorder, said method comprising the steps of receiving information related to 58224 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 58224 and/or related to a 58224-associated disease or disorder, and based on one or more of the phenotypic information, the 58224 information, and the acquired information, determining whether the subject has a 58224-associated disease or disorder or a pre-disposition to a 58224-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[1742] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

Background of the 46980 Invention

[1743] Carboxylesterases are a structurally related family of proteins. Carboxylesterases have been classified into three categories (A, B and C) on the basis of differential patterns of inhibition by organophosphates (Myers, M. et al. (1988) Mol. Biol. Evol. 5(2):113-119). Members of the type B carboxylesterase subfamily include, for example, mammalian cholinesterases and mammalian bile salt activated lipases. Generally, enzymatically active carboxylesterases have a conserved catalytic triad of active site residues. However, some carboxylesterases lack at least one member of the triad and may not retain catalytic function. The neuroligins are among such carboxylesterases.

[1744] Neuroligins are cell surface molecules composed of approximately five domains: an N-terminal signal sequence, which can be cleaved, a large extracellular domain homologous to carboxylesterases, a linker domain between the transmembrane region and the carboxylesterase homology domain, a transmembrane region, and a cytoplasmic tail (Ichtchenko, K. et al. (1996) J. Biol. Chem. 271(5):2676-2682). Sequence comparisons place the neuroligins in the large family of esterase homology domain proteins that includes thyroglobulin, acetylcholinesterase, and gliotactin. However, neuroligins are only distantly related to these proteins, and thus appear to form a unique subset of the esterase family. At least three neuroligins have been cloned from rat, namely Neuroligins 1, 2 and 3 (Ichtchenko, K. et al. (1995) Cell 81:435-443; Ichtchenko, K. et al. (1996) supra). These three neuroligins are expressed at high levels in the brain, primarily in neurons. Neuroligins have been shown to mediate cell adhesion events associated with neuronal development and/or maintenance. For example, Neuroligin 1 has been found to be enriched in postsynaptic densities where it may recruit receptors, channels, and signal transduction molecules at synaptic sites of cell adhesion (Song et al. (1999) Proc. Natl. Acad. Sci. 96(3):1100-5).

[1745] Functionally, neuroligins bind tightly, in a calcium-dependent manner, to the extracellular domains of the polymorphic cell surface proteins known as β-neurexins. Neurexins are neuronal cell surface proteins that exhibit a high degree of diversity (Ushkaryov et al. (1994) J. Biol. Chem. 269:11987-11992). Neuroligin-β-neurexin interactions have been implicated in mediating recognition processes between neurons that give rise to neuronal developmental events such as synaptogenesis (e.g., specification of excitatory synapses) (Brose, N. (1999) Naturwissenschaften 86(11):516-24).

Summary of the 46980 Invention

[1746] The present invention is based, in part, on the discovery of a novel neuroligin family member, referred to herein as “46980”. The nucleotide sequence of a cDNA encoding 46980 is shown in SEQ ID NO:27, and the amino acid sequence of a 46980 polypeptide is shown in SEQ ID NO:28. In addition, the nucleotide sequences of the coding region are depicted in SEQ ID NO:29. See, e.g., Example 19, below.

[1747] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes a 46980 protein or polypeptide, e.g., a biologically active portion of the 46980 protein. In a preferred embodiment the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO:28. In other embodiments, the invention provides isolated 46980 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO:27, SEQ ID NO:29, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO:27, SEQ ID NO:29, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:27, SEQ ID NO:29, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 46980 protein or an active fragment thereof.

[1748] In a related aspect, the invention further provides nucleic acid constructs that include a 46980 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included, are vectors and host cells containing the 46980 nucleic acid molecules of the invention e.g., vectors and host cells suitable for producing 46980 nucleic acid molecules and polypeptides.

[1749] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 46980-encoding nucleic acids.

[1750] In still another related aspect, isolated nucleic acid molecules that are antisense to a 46980 encoding nucleic acid molecule are provided.

[1751] In another aspect, the invention features 46980 polypeptides, and biologically active or antigenic fragments thereof that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 46980-mediated or -related disorders. In another embodiment, the invention provides 46980 polypeptides having a 46980 activity. Preferred polypeptides are 46980 proteins including at least one carboxylesterase domain, and, preferably, having a 46980 activity, e.g., a neurexin-binding or other 46980 activity described herein.

[1752] In other embodiments, the invention provides 46980 polypeptides, e.g., a 46980 polypeptide having the amino acid sequence shown in SEQ ID NO:28 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO:28 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:27, SEQ ID NO:29, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 46980 protein or an active fragment thereof.

[1753] In a related aspect, the invention further provides nucleic acid constructs which include a 46980 nucleic acid molecule described herein.

[1754] In a related aspect, the invention provides 46980 polypeptides or fragments operatively linked to non-46980 polypeptides to form fusion proteins.

[1755] In another aspect, the invention features antibodies and antigen-binding fragments thereof, that react with, or more preferably specifically bind 46980 polypeptides or fragments thereof, e.g., an extracellular domain of a 46980 polypeptide. In one embodiment, the antibodies or antigen-binding fragment thereof competitively inhibit the binding of a second antibody to a 46980 polypeptide or a fragment thereof, e.g., an extracellular domain of a 46980 polypeptide such as a neurexin binding domain or a carboxylesterase domain.

[1756] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 46980 polypeptides or nucleic acids.

[1757] In still another aspect, the invention provides a process for modulating 46980 polypeptide or nucleic acid expression or activity, e.g. using the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 46980 polypeptides or nucleic acids, such as conditions involving a neuronal disorder, e.g., a pain-related, neuronal connectivity-related, or neural degenerative disorder.

[1758] The invention also provides assays for determining the activity of or the presence or absence of 46980 polypeptides or nucleic acid molecules in a biological sample, including for disease diagnosis.

[1759] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 46980 polypeptides or nucleic acids, e.g., agents (e.g., compounds) that modulate the normal pain response, aberrant or altered pain response, inflammatory response, or fertility (e.g., spermatid activity).

[1760] In a preferred embodiment, the effect of an agent, e.g., compound, on the pain response is evaluated by an analgesic test, e.g., the hot plate test, tail flick test, writhing test, paw pressure test, electric stimulation test, tail withdrawal test, or formalin test.

[1761] In a preferred embodiment, the compound inhibits a 46980 activity.

[1762] In another preferred embodiment, the agent, e.g., compound, modulates endogenous levels of a 46980 ligand, e.g., a neurexin, e.g., a β-neurexin.

[1763] In still another aspect, the invention provides a process for modulating 46980 polypeptide or nucleic acid expression or activity, e.g. using the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant, e.g., decreased or increased expression of the 46980 polypeptides or nucleic acids, such as conditions involving pain response, aberrant or altered pain response, pain related disorders, or fertility.

[1764] In still another aspect, the invention features a method of modulating (e.g., enhancing or inhibiting) the pain response. The method includes contacting a cell with an agent that modulates the activity or expression of a 46980 polypeptide or nucleic acid, in an amount effective to modulate the pain response.

[1765] In a preferred embodiment, the agent modulates (e.g., increases or decreases) a 46980 activity, e.g., a protein-interaction function of a 46980 polypeptide, e.g., neurexin binding, e.g., β-neurexin binding.

[1766] In a preferred embodiment, the agent, e.g., the compound, modulates (e.g., increases or decreases) expression of the 46980 nucleic acid by, e.g., modulating transcription, mRNA stability, mRNA nuclear export, splicing, and so forth.

[1767] In preferred embodiments, the agent, e.g., the compound, is a peptide, a phosphopeptide, a small molecule, e.g., a member of a combinatorial library, or an antibody, or any combination thereof. The antibody can be conjugated to a therapeutic moiety selected from the group consisting of a cytotoxin, a cytotoxic agent and a radioactive metal ion.

[1768] In additional preferred embodiments, the agent is an antisense molecule, a ribozyme, a triple helix molecule, or a 46980 nucleic acid, or any combination thereof.

[1769] In a preferred embodiment, the agent, e.g., the compound, is administered in combination with a cytotoxic agent.

[1770] In a preferred embodiment, the cell, e.g., the 46980-expressing cell, is a central or peripheral nervous system cell, e.g., a cell in an area involved in pain control, e.g., a cell in the substantia gelatinosa of the spinal cord, or a cell in the periaqueductal gray matter.

[1771] In a preferred embodiment, the agent and the 46980-polypeptide or nucleic acid are contacted in vitro or ex vivo.

[1772] In a preferred embodiment, the contacting step is effected in vivo in a subject, e.g., as part of a therapeutic or prophylactic protocol. The contacting step(s) can be repeated.

[1773] Preferably, the subject is a human, e.g., a patient with pain or a pain-associated disorder disclosed herein. For example, the subject can be a patient with pain elicited from tissue injury, e.g., inflammation, infection, ischemia; pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches, e.g., migraine; pain associated with surgery; pain related to inflammation, e.g., irritable bowel syndrome; or chest pain. The subject can be a patient with complex regional pain syndrome (CRPS), reflex sympathetic dystrophy (RSD), causalgia, neuralgia, central pain and dysesthesia syndrome, carotidynia, neurogenic pain, refractory cervicobrachial pain syndrome, myofascial pain syndrome, craniomandibular pain dysfunction syndrome, chronic idiopathic pain syndrome, Costen's pain-dysfunction, acute chest pain syndrome, gynecologic pain syndrome, patellofemoral pain syndrome, anterior knee pain syndrome, recurrent abdominal pain in children, colic, low back pain syndrome, neuropathic pain, phantom pain from amputation, phantom tooth pain, or pain asymbolia. The subject can be a cancer patient, e.g., a patient with brain cancer, bone cancer, or prostate cancer. In other embodiments, the subject is a non-human animal, e.g., an experimental animal, e.g., an arthritic rat model of chronic pain, a chronic constriction injury (CCI) rat model of neuropathic pain, or a rat model of unilateral inflammatory pain by intraplantar injection of Freund's complete adjuvant (FCA).

[1774] In preferred embodiments, the agent, e.g., the compound, is a peptide, a phosphopeptide, a small molecule, e.g., a member of a combinatorial library, or an antibody, or any combination thereof. The antibody can be conjugated to a therapeutic moiety selected from the group consisting of a cytotoxin, a cytotoxic agent and a radioactive metal ion.

[1775] In additional preferred embodiments, the agent is an antisense, a ribozyme, or a triple helix molecule, or a 46980 nucleic acid, or any combination thereof.

[1776] In a preferred embodiment, the agent is administered in combination with a cytotoxic agent. The administration of the agent and/or protein can be repeated.

[1777] In still another aspect, the invention features a method of modulating (e.g., enhancing or inhibiting) spermatid activity or fertility. The method includes contacting a cell with an agent that modulates the activity or expression of a 46980 polypeptide or nucleic acid, in an amount effective to modulate fertility or spermatid activity.

[1778] In a preferred embodiment, the agent modulates (e.g., increases or decreases) a 46980 ligand binding activity, e.g., neurexin binding activity. In another preferred embodiment, the agent modulates (e.g., increases or decreases) expression of the 46980 nucleic acid by, e.g., modulating transcription, mRNA stability, mRNA nuclear export, splicing, and so forth.

[1779] In preferred embodiments, the agent is a peptide, a phosphopeptide, a small molecule, e.g., a member of a combinatorial library, or an antibody, or any combination thereof. The antibody can be conjugated to a therapeutic moiety selected from the group consisting of a cytotoxin, a cytotoxic agent and a radioactive metal ion.

[1780] In additional preferred embodiments, the agent is an antisense molecule, a ribozyme, a triple helix molecule, or a 46980 nucleic acid, or any combination thereof.

[1781] In a preferred embodiment, the agent is administered in combination with a cytotoxic agent.

[1782] In a preferred embodiment, the cell, e.g., the 46980-expressing cell, is a cell of the male reproductive system, e.g., a spermatid cell.

[1783] In a preferred embodiment, the agent and the 46980-polypeptide or nucleic acid are contacted in vitro or ex vivo.

[1784] In a preferred embodiment, the contacting step is effected in vivo in a subject, e.g., as part of a therapeutic or prophylactic protocol. Preferably, the subject is a human, e.g., a patient with infertility. The subject can be a cancer patient, e.g., a patient with prostate cancer. In other embodiments, the subject is a non-human animal, e.g., an experimental animal, e.g., a rodent model for infertility. In still other embodiments, the subject is a human or non-human animal for which contraception is intended. The contacting step(s) can be repeated.

[1785] In preferred embodiments, the agent is a peptide, a phosphopeptide, a small molecule, e.g., a member of a combinatorial library, or an antibody, or any combination thereof. The antibody can be conjugated to a therapeutic moiety selected from the group consisting of a cytotoxin, a cytotoxic agent and a radioactive metal ion.

[1786] In additional preferred embodiments, the agent is an antisense, a ribozyme, or a triple helix molecule, or a 46980 nucleic acid, or any combination thereof.

[1787] In a preferred embodiment, the agent is administered in combination with a cytotoxic agent.

[1788] The administration of the agent and/or protein can be repeated.

[1789] In still another aspect, the invention features a method for evaluating the efficacy of a treatment of a disorder, e.g., a disorder disclosed herein, in a subject. The method includes treating a subject with a protocol under evaluation; and assessing the expression of a 46980 nucleic acid or 46980 polypeptide, such that a change in the level of 46980 nucleic acid or 46980 polypeptide after treatment, relative to the level before treatment, is indicative of the efficacy of the treatment of the disorder.

[1790] In a preferred embodiment, the disorder is a neuronal disorder. In a preferred embodiment, the disorder is a pain or a pain related disorder. In another preferred embodiment, the disorder is a neurodegenerative disorder. In still another preferred embodiment, the disorder is a neuronal connectivity-related disorder or a developmental disorder of the nervous system.

[1791] In a preferred embodiment, the disorder is infertility or aberrant spermatid cell activity.

[1792] In a preferred embodiment, the disorder is a cancer, e.g., prostate cancer, brain cancer, neurofibromatosis, or testicular cancer.

[1793] In a preferred embodiment, the subject is a human. In a preferred embodiment, the subject is an experimental animal, e.g., an animal model for a pain or pain-related disorder.

[1794] The invention also features a method of diagnosing a disorder, e.g., a disorder disclosed herein, in a subject. The method includes evaluating the expression or activity of a 46980 nucleic acid or a 46980 polypeptide, such that, a difference in the level of 46980 nucleic acid or 46980 polypeptide relative to a normal subject or a cohort of normal subjects is indicative of the disorder.

[1795] In a preferred embodiment, the disorder is a neuronal disorder. In a preferred embodiment, the disorder is a pain or a pain related disorder. In another preferred embodiment, the disorder is a neurodegenerative disorder. In still another preferred embodiment, the disorder is a neuronal connectivity-related disorder or a developmental disorder of the nervous system.

[1796] In a preferred embodiment, the disorder is infertility or aberrant spermatid cell activity.

[1797] In a preferred embodiment, the disorder is a cancer, e.g., prostate cancer, brain cancer, neurofibromatosis, or testicular cancer.

[1798] In a preferred embodiment, the subject is a human.

[1799] In a preferred embodiment, the evaluating step occurs in vitro or ex vivo. For example, a sample, e.g., a blood sample, is obtained from the subject.

[1800] In a preferred embodiment, the evaluating step occurs in vivo, for example, by administering to the subject a detectably labeled agent that interacts with the 46980 nucleic acid or polypeptide, such that a signal is generated relative to the level of activity or expression of the 46980 nucleic acid or polypeptide.

[1801] The invention also provides assays for determining the activity of or the presence or absence of 46980 polypeptides or nucleic acid molecules in a biological sample, including for disease diagnosis.

[1802] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 46980 polypeptide or nucleic acid molecule, including for disease diagnosis.

[1803] In yet another aspect, the invention features a method for identifying an agent, e.g., a compound, which modulates the activity of a 46980 polypeptide, e.g., a 46980 polypeptide as described herein, or the expression of a 46980 nucleic acid, e.g., a 46980 nucleic acid as described herein, including contacting the 46980 polypeptide or nucleic acid with a test agent (e.g., a test compound); and determining the effect of the test compound on the activity of the 46980 polypeptide or nucleic acid to thereby identify a compound which modulates the activity of the 46980 polypeptide or nucleic acid.

[1804] In a preferred embodiment, the activity of the 46980 polypeptide is a protein-interaction activity, e.g., a neurexin-binding activity, e.g., a β-neurexin binding activity.

[1805] In a preferred embodiment, the activity of the 46980 polypeptide is modulation of pain response.

[1806] In preferred embodiments, the agent is a peptide (or polypeptide), a small molecule, e.g., a member of a combinatorial library, or an antibody, or any combination thereof. The antibody, peptide, or small molecule can bind to a 46980 surface region that interfaces with a 46980 ligand, e.g., a neurexin, e.g., a β-neurexin.

[1807] In additional preferred embodiments, the agent is an antisense, a ribozyme, or a triple helix molecule, or a 46980 nucleic acid, or any combination thereof.

[1808] In another aspect, the invention features a method of inhibiting the function or inducing the killing of a 46980-expressing cell, e.g., a brain cell, a neuron, a spinal cord cell, a prostate cell, a testes cell, or a spermatid. The method includes contacting the 46980-expressing cell with a compound that binds to a 46980 polypeptide in an amount effective to inhibit the function of the cell or induce killing of the cell. In a preferred embodiment, the compound is a small molecule. In another preferred embodiment, the compound is a polypeptide, e.g., a polypeptide that binds the 46980 extracellular domain, e.g., an antibody or a soluble fragment of a neurexin. The polypeptide can be coupled to a cytotoxin, a cytotoxic agent, or a radioactive metal ion.

[1809] In another embodiment, the compound disrupts a 46980 interaction with a 46980 ligand.

[1810] In still another aspect, the invention features a method of labelling a 46980-expressing cell, e.g., a brain cell, a neuron, a spinal cord cell, a prostate cell, a testes cell, or a spermatid. The method includes contacting the 46980-expressing cell with a compound that binds to a 46980 polypeptide. In a preferred embodiment, the compound is a small molecule that includes a label or a moiety recognizable by a labeling entity. In another preferred embodiment, the compound is a polypeptide, e.g., a polypeptide that binds the 46980 extracellular domain, e.g., an antibody or a soluble fragment of a neurexin. The polypeptide includes a label or a moiety recognizable by a labeling entity.

[1811] In a preferred embodiment, the 46980-expressing cell is in a subject mammal. The compound is labeled, e.g., with a label that is detectable by a scan of the subject, e.g., by magnetic resonance imaging. The compound is administered to the subject, e.g., by parenteral injection or by an epidural injection.

[1812] In another preferred embodiment, the 46980-expressing cell is in a sample, e.g., a biopsy or a tissue section. The sample can be fixed or live.

[1813] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 46980 polypeptide or nucleic acid molecule, including for disease diagnosis.

[1814] In another aspect, the invention features a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes a 46980 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to a 46980 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 46980 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.
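
By way of illustration only, the array described above can be modeled as a mapping from positionally distinct addresses to unique capture probes. The following Python sketch is not part of the disclosure: the class and function names, the row and column layout, and the "control" probes are hypothetical and are introduced solely for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptureProbe:
    kind: str    # "nucleic acid" or "polypeptide"
    target: str  # name of the molecule the probe recognizes


def build_array(probes, columns):
    """Assign each probe to a distinct (row, column) address."""
    if len(set(probes)) != len(probes):
        raise ValueError("each address must carry a unique capture probe")
    return {divmod(index, columns): probe for index, probe in enumerate(probes)}


def addresses_for(array, target):
    """List the addresses whose capture probe recognizes the given target."""
    return [address for address, probe in array.items() if probe.target == target]


# Hypothetical probe set: two probes recognize a 46980 molecule, two do not.
probes = [
    CaptureProbe("nucleic acid", "46980"),
    CaptureProbe("polypeptide", "46980"),
    CaptureProbe("nucleic acid", "control A"),
    CaptureProbe("nucleic acid", "control B"),
]
array = build_array(probes, columns=2)
print(addresses_for(array, "46980"))
```

Keying the mapping by (row, column) makes each address positionally distinguishable by construction, and the uniqueness check mirrors the requirement that each address carry a unique capture probe.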

[1815] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

Detailed Description of 46980

[1816] The human 46980 sequence (see SEQ ID NO:27, as recited in Example 19), which is approximately 3502 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 2451 nucleotides, including the termination codon. The coding sequence encodes an 816 amino acid protein (see SEQ ID NO:28, as recited in Example 19). The human 46980 protein of SEQ ID NO:28 and FIG. 16 includes an amino-terminal hydrophobic amino acid sequence, consistent with a signal sequence, of about 43 amino acids (from amino acid 1 to about amino acid 43 of SEQ ID NO:28), which upon cleavage results in the production of a mature protein form. This mature protein form is approximately 773 amino acid residues in length (from about amino acid 44 to amino acid 816 of SEQ ID NO:28).
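
By way of illustration only, the length relationships recited above can be checked with simple arithmetic. The following Python sketch uses only the figures stated in this paragraph; the variable names are hypothetical and are not part of the disclosure.

```python
# Figures taken from the paragraph above; names are illustrative only.
CODING_SEQUENCE_NT = 2451  # coding sequence length, including the termination codon
SIGNAL_SEQUENCE_AA = 43    # predicted signal sequence, amino acids 1 to 43

# Each codon is three nucleotides; the final codon is the termination codon.
codons = CODING_SEQUENCE_NT // 3
protein_length = codons - 1

# Cleavage of the signal sequence leaves the mature protein form.
mature_start = SIGNAL_SEQUENCE_AA + 1
mature_length = protein_length - SIGNAL_SEQUENCE_AA

print(protein_length, mature_start, mature_length)
```

The stated coding sequence length is an exact multiple of three, so the protein length, the first residue of the mature form, and the mature length all follow directly from the two recited figures.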

[1817] Human 46980 contains the following regions or other structural features:

[1818] a predicted carboxylesterase domain (PFAM Accession PF00135) located at about amino acid 25 to 590 of SEQ ID NO:28 (which includes a carboxylesterase type B signature 2 domain located at about amino acids 144 to 154 of SEQ ID NO:28);

[1819] a predicted transmembrane region located at about amino acids 675 to 696 of SEQ ID NO:28;

[1820] a predicted N-terminal extracellular domain located at about amino acids 1 to 674 of SEQ ID NO:28;

[1821] a predicted C-terminal intracellular domain located at about amino acids 697 to 816 of SEQ ID NO:28;

[1822] two predicted N-glycosylation sites (PS00001) located from about amino acids 102 to 105, and 511 to 514 of SEQ ID NO:28;

[1823] a glycosaminoglycan attachment site (PS00002) located at about amino acids 253 to 256 of SEQ ID NO:28;

[1824] five predicted protein kinase C phosphorylation sites (PS00005) located at about amino acids 707 to 709, 751 to 753, 763 to 765, 794 to 796, and 813 to 815, of SEQ ID NO:28;

[1825] two predicted casein kinase II phosphorylation sites (PS00006) located at about amino acids 717 to 720 and 767 to 770, of SEQ ID NO:28; and

[1826] ten predicted N-myristylation sites (PS00008) located at about amino acids 75 to 80, 99 to 104, 187 to 192, 239 to 244, 252 to 257, 281 to 286, 370 to 375, 382 to 387, 389 to 394, and 800 to 805, of SEQ ID NO:28.
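
By way of illustration only, the predicted topology listed above can be recorded as a small data structure and checked for internal consistency. The following Python sketch is not part of the disclosure; the names Region, TOPOLOGY, tiles_protein, and region_length are hypothetical, and the coordinates are the approximate ("about") positions recited in the list.

```python
from typing import NamedTuple


class Region(NamedTuple):
    name: str
    start: int  # first residue, 1-based, inclusive
    end: int    # last residue, inclusive


PROTEIN_LENGTH = 816  # length of SEQ ID NO:28 as recited above

# Approximate coordinates recited in the feature list.
TOPOLOGY = [
    Region("N-terminal extracellular domain", 1, 674),
    Region("transmembrane region", 675, 696),
    Region("C-terminal intracellular domain", 697, 816),
]


def region_length(region):
    """Number of residues in a region with inclusive coordinates."""
    return region.end - region.start + 1


def tiles_protein(regions, length):
    """Return True if the regions cover residues 1..length with no gaps or overlaps."""
    expected_start = 1
    for region in sorted(regions, key=lambda r: r.start):
        if region.start != expected_start:
            return False
        expected_start = region.end + 1
    return expected_start == length + 1


print(tiles_protein(TOPOLOGY, PROTEIN_LENGTH))
print({region.name: region_length(region) for region in TOPOLOGY})
```

Using inclusive 1-based coordinates keeps the sketch aligned with the residue numbering used throughout this description.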

[1827] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[1828] A plasmid containing the nucleotide sequence encoding human 46980 (clone “Fbh46980FL”) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. § 112.

[1829] The 46980 protein contains a significant number of structural characteristics in common with members of the carboxylesterase family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics. Carboxylesterase family members are known to act on carboxylic esters. Based on the differential patterns of inhibition by organophosphates, carboxylesterases have been classified into three categories (A, B and C) (Myers, M. et al. (1988) Mol. Biol. Evol. 5(2):113-119). 46980 proteins of the invention include a carboxylesterase type B signature 2 domain located at about amino acids 144 to 154 of SEQ ID NO:28, which suggests that the 46980 proteins belong to the carboxylesterase type B family.

[1830] 46980 proteins of the invention are homologous to the rat neuroligin protein, in particular, the rat neuroligin 3 protein (FIG. 17). Thus, the proteins of the invention are members of the neuroligin subfamily of carboxylesterase type B proteins. As used herein, the term “neuroligin” refers to cell surface molecules composed of five domains: an N-terminal cleaved signal sequence, a large extracellular domain homologous to esterases, a linker domain between the transmembrane region and the esterase homology domain, a single transmembrane region, and a cytoplasmic tail (Ichtchenko, K. et al. (1996) J. Biol. Chem. 271(5):2676-2682). Neuroligins can lack a catalytic serine amino acid at a conserved position in the carboxylesterase active site.

[1831] Members of the neuroligins are typically expressed at high levels in the nervous system, e.g., the spinal cord and the brain, e.g., primarily in neurons. Preferably, neuroligins are capable of mediating cell adhesion events associated with development and/or maintenance, e.g., neural events such as synaptogenesis, recruitment of receptors, channels, and signal transduction molecules at synaptic sites (e.g., at excitatory synapses) (Song et al. (1999) Proc. Natl. Acad. Sci. 96(3):1100-5). Preferably, neuroligins are capable of interacting with a cell surface protein, e.g., a neurexin (e.g., a β-neurexin). Preferably, neuroligin-β-neurexin interactions mediate cell adhesion events, e.g., neuron-neuron, or neuron-glia cell adhesion events.

[1832] A 46980 polypeptide can include at least one “carboxylesterase domain” or at least one region homologous with a “carboxylesterase domain”. A 46980 can optionally further include at least one transmembrane domain; at least one extracellular domain; and at least one intracellular domain. A 46980 can optionally further include at least one, two, three, preferably four, N-glycosylation sites; at least one, preferably two, cAMP/cGMP phosphorylation sites; at least one, two, three, four, or five protein kinase C sites; at least one or two casein kinase II sites; at least one, two, three, four, five, six, seven, eight, preferably ten N-myristylation sites; and at least one glycosaminoglycan attachment site.

[1833] As used herein, the term “carboxylesterase domain” refers to a protein domain which includes a carboxylesterase type B signature 2 domain, is at least 300 amino acids in length, and has a bit score for alignment to the HMM profile for the Pfam carboxylesterase domain (PF00135) at the time of filing of at least 100.

[1834] Preferably, the carboxylesterase type B signature 2 domain is about 5 to 20 amino acids, more preferably 8-15, most preferably 11 amino acids and includes the sequence: E-D-X(0,1)-C-L-Y (SEQ ID NO:32). Most preferably, the carboxylesterase type B signature 2 domain has the amino acid sequence: EDCLYNIYVP located at about amino acids 144 to 154 of SEQ ID NO:28.
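
By way of illustration only, the pattern E-D-X(0,1)-C-L-Y recited above can be expressed as a regular expression, reading X(0,1) as zero or one residue of any kind. The following Python sketch is not part of the disclosure; the names SIGNATURE_CORE and find_signature are hypothetical, and the sketch scans only for the recited core pattern rather than the full signature 2 domain.

```python
import re

# E-D-X(0,1)-C-L-Y: Glu, Asp, zero or one residue of any kind, then Cys-Leu-Tyr.
SIGNATURE_CORE = re.compile(r"ED[A-Z]?CLY")


def find_signature(sequence):
    """Return 1-based, inclusive (start, end) positions of each core pattern match."""
    return [(match.start() + 1, match.end())
            for match in SIGNATURE_CORE.finditer(sequence.upper())]


# The fragment recited above, scanned on its own; positions are relative to the fragment.
print(find_signature("EDCLYNIYVP"))
```

Positions returned by the sketch are relative to whatever sequence is supplied, so scanning a fragment does not reproduce the numbering of SEQ ID NO:28.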

[1835] Preferably, the carboxylesterase domain has an amino acid sequence of about 400 to about 650 amino acid residues and has a bit score for the alignment of the sequence to the carboxylesterase domain (HMM) of at least 100. Preferably, a carboxylesterase domain includes at least about 450 to about 600 amino acids, more preferably about 500 to about 575 amino acid residues, about 550 to 570, or about 565 amino acids and has a bit score for the alignment of the sequence to the carboxylesterase domain (HMM) of at least 200, preferably 300, more preferably 400 or greater. The carboxylesterase domain (HMM) has been assigned the PFAM Accession (PF00135) (http://genome.wustl.edu/Pfam/html). An alignment of the carboxylesterase domain (from about amino acids 25 to about 590 of SEQ ID NO:28) of human 46980 with a consensus amino acid sequence derived from a hidden Markov model (PFAM) at the time of filing is depicted in FIG. 16.

[1836] In a preferred embodiment, a 46980 polypeptide or protein has a “carboxylesterase domain” or a region which includes at least about 400 to about 650 amino acids, preferably 450 to about 600 amino acids, more preferably about 500 to about 570 amino acid residues, about 550 to 570, or about 565 amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “carboxylesterase domain,” e.g., the carboxylesterase domain of human 46980 (e.g., residues 25 to 590 of SEQ ID NO:28).

[1837] To identify the presence of a “carboxylesterase” domain in a 46980 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against a database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference. A search was performed against the PFAM HMM database resulting in the identification of a “carboxylesterase domain” in the amino acid sequence of human 46980 at about residues 25 to about 590 of SEQ ID NO:28 (see FIGS. 15 and 17).
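
By way of illustration only, the effect of the threshold score described above can be sketched as a filter over scored domains. The following Python sketch does not invoke the HMMER package or any Pfam service; the function name select_hits and the example scores are hypothetical, and treating a score equal to the threshold as a hit is an assumption made for the example.

```python
# Thresholds quoted in the paragraph above, in bits.
DEFAULT_THRESHOLD_BITS = 15.0
LOWERED_THRESHOLD_BITS = 8.0


def select_hits(scored_domains, threshold=DEFAULT_THRESHOLD_BITS):
    """Return the (domain, bit score) pairs whose score meets the threshold.

    Assumption: a score equal to the threshold counts as a hit.
    """
    return [(name, score) for name, score in scored_domains if score >= threshold]


# Hypothetical scores, invented for illustration; they are not search results.
example_scores = [("domain_a", 412.0), ("domain_b", 11.5), ("domain_c", 3.2)]

print(select_hits(example_scores))
print(select_hits(example_scores, LOWERED_THRESHOLD_BITS))
```

Lowering the threshold admits weaker matches, which is why the second call returns an additional domain for the same hypothetical input.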

[1838] In one embodiment, a 46980 protein includes at least one, preferably two, transmembrane domains. As used herein, the term “transmembrane domain” includes an amino acid sequence of about 15 amino acid residues in length that spans a phospholipid membrane. More preferably, a transmembrane domain includes about at least 16, 18, 20, 21, 22, 25, 30, 35 or 40 amino acid residues and spans a phospholipid membrane. Transmembrane domains are rich in hydrophobic residues, and typically have an α-helical structure. In a preferred embodiment, at least 50%, 60%, 70%, 80%, 90%, 95% or more of the amino acids of a transmembrane domain are hydrophobic, e.g., leucines, isoleucines, tyrosines, methionines, phenylalanines, or tryptophans. Transmembrane domains are described in, for example, Zagotta W. N. et al. (1996) Annu. Rev. Neurosci. 19:235-63, the contents of which are incorporated herein by reference.
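
By way of illustration only, the hydrophobic content of a candidate transmembrane segment can be computed as the fraction of residues that fall within a chosen hydrophobic set. The following Python sketch is not part of the disclosure: the names HYDROPHOBIC and hydrophobic_fraction are hypothetical, the residue set is limited to the examples named above (which the text presents as non-exhaustive), and the example segment is invented and is not taken from SEQ ID NO:28.

```python
# One-letter codes for the residues named above as examples:
# leucine, isoleucine, tyrosine, methionine, phenylalanine, tryptophan.
HYDROPHOBIC = frozenset("LIYMFW")


def hydrophobic_fraction(segment):
    """Fraction of residues in the segment that belong to the hydrophobic set."""
    if not segment:
        raise ValueError("segment must not be empty")
    segment = segment.upper()
    return sum(1 for residue in segment if residue in HYDROPHOBIC) / len(segment)


# Hypothetical 22-residue segment, invented for illustration only.
example_segment = "LLIFGMLWLIYLLFIMLAGSLL"

fraction = hydrophobic_fraction(example_segment)
print(f"{fraction:.0%} of {len(example_segment)} residues are in the hydrophobic set")
```

A broader residue set (for example, one that also counts alanine or valine) would give a higher fraction for the same segment, so the result depends on which residues are treated as hydrophobic.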

[1839] In a preferred embodiment, a 46980 polypeptide or protein has at least one transmembrane domain or a region which includes at least 16, 18, 20, 21, 22, 25, 30, 35 or 40 amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “transmembrane domain,” e.g., at least one transmembrane domain of human 46980 (e.g., from about amino acid residues 675 to about 696 of SEQ ID NO:28).

[1840] A 46980 protein further includes a predicted N-terminal extracellular domain located at about amino acids 1-674 (or 44-674 of the mature protein) of SEQ ID NO:28. As used herein, an “N-terminal extracellular domain” includes an amino acid sequence about 1-800, preferably about 200-700, and even more preferably about 300-680 or 674, amino acid residues in length and is located outside of a cell, extracellularly, or internally in a membrane compartment that is topologically equivalent to the extracellular milieu. The C-terminal amino acid residue of an “N-terminal extracellular domain” is adjacent to an N-terminal amino acid residue of a transmembrane domain in a naturally-occurring 46980 or 46980-like protein. For example, an N-terminal extracellular domain is located at about amino acid residues 1 to 674 of SEQ ID NO:28.

[1841] In a preferred embodiment, a 46980 polypeptide or protein has an “N-terminal extracellular domain” or a region which includes at least about 1-800, preferably about 200-700, and even more preferably about 300-680 or 674 amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with an “N-terminal extracellular domain,” e.g., the N-terminal extracellular domain of human 46980 (e.g., residues 1-674 of SEQ ID NO:28). Preferably, the N-terminal extracellular domain is capable of interacting with (e.g., binding to) an extracellular signal (e.g., a neurexin) and/or modulating cell adhesion.

[1842] In another embodiment, a 46980 protein includes a “C-terminal cytoplasmic domain”, also referred to herein as a C-terminal cytoplasmic tail, in the sequence of the protein. As used herein, a “C-terminal cytoplasmic domain” includes an amino acid sequence having a length of at least about 50 to 200, preferably 100 to 150, and more preferably 109 amino acid residues and is located within a cell or within the cytoplasm of a cell. Accordingly, the N-terminal amino acid residue of a “C-terminal cytoplasmic domain” is adjacent to a C-terminal amino acid residue of a transmembrane domain in a naturally-occurring 46980 or 46980-like protein. For example, a C-terminal cytoplasmic domain is found at about amino acid residues 697-816 of SEQ ID NO:28.

[1843] In a preferred embodiment, a 46980 polypeptide or protein has a C-terminal cytoplasmic domain or a region which includes at least about 50 to 200, preferably 100 to 150, and more preferably 109 amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “C-terminal cytoplasmic domain,” e.g., the C-terminal cytoplasmic domain of human 46980 (e.g., residues 697-816 of SEQ ID NO:28).

[1844] A 46980 molecule can further include a signal sequence. As used herein, a “signal sequence” refers to a peptide of about 20-50 amino acid residues in length which occurs at the N-terminus of secretory and integral membrane proteins and which contains a majority of hydrophobic amino acid residues. For example, a signal sequence contains at least about 30-48 amino acid residues, preferably about 40-45 amino acid residues, more preferably about 43 amino acid residues, and has at least about 40-70%, preferably about 50-65%, and more preferably about 55-60% hydrophobic amino acid residues (e.g., alanine, valine, leucine, isoleucine, phenylalanine, tyrosine, tryptophan, or proline). Such a “signal sequence”, also referred to in the art as a “signal peptide”, serves to direct a protein containing such a sequence to a lipid bilayer. For example, in one embodiment, a 46980 protein contains a signal sequence of about amino acids 1-43 of SEQ ID NO:28. The “signal sequence” can be cleaved during processing of the mature protein. The mature 46980 protein can correspond to about amino acids 44 to 816 of SEQ ID NO:28.

[1845] As the 46980 polypeptides of the invention may modulate 46980-mediated activities, they may be useful for developing novel diagnostic and therapeutic agents for 46980-mediated or -related disorders, as described below.

[1846] As used herein, a “46980 activity”, “biological activity of 46980” or “functional activity of 46980”, refers to an activity exerted by a 46980 protein, polypeptide or nucleic acid molecule. For example, a 46980 activity can be an activity exerted by 46980 in a physiological milieu on, e.g., a 46980-responsive cell or on a 46980 substrate, e.g., a protein substrate. A 46980 activity can be determined in vivo or in vitro. In one embodiment, a 46980 activity is a direct activity, such as an association with a 46980 target molecule. A “target molecule” or “binding partner” is a molecule with which a 46980 protein binds or interacts in nature. In an exemplary embodiment, a 46980 binding partner is a neurexin, e.g., a β-neurexin.

[1847] A 46980 activity can also be an indirect activity, e.g., a cellular signaling activity mediated by interaction of the 46980 protein with a 46980 receptor. The features of the 46980 molecules of the present invention can provide similar biological activities as neuroligin family members. For example, the 46980 proteins of the present invention are predicted to have one or more of the following activities: (1) ability to catalyze the hydrolysis of carboxylic esters; (2) ability to mediate cell-cell (e.g., neuron-neuron, or neuron-glia) recognition events, adhesion or attachment; (3) ability to interact with a cell surface protein (e.g., neurexin) or an extracellular component; (4) ability to modulate cell migration; (5) ability to modulate patterning; (6) ability to modulate proliferation, and/or differentiation, of a cell (e.g., a neural cell); (7) ability to modulate embryonic development and differentiation; (8) ability to modulate morphogenesis; (9) ability to modulate tissue maintenance; or (10) ability to modulate neural development, e.g., axonal growth, synaptogenesis, neurite outgrowth, membrane excitability and/or guidance.

[1848] Based on their localization and their structural features, the 46980 molecules of the present invention can have similar biological activities as carboxylesterase family members, in particular neuroligin proteins. Thus, the 46980 molecules can act as novel diagnostic targets and therapeutic agents for controlling a neuronal disorder, e.g., a pain-related, neuronal connectivity-related, or neurodegenerative disorder, as well as male-fertility disorders, disorders of the brain, and cell proliferative disorders.

[1849] 46980 is highly expressed in the central and peripheral nervous system (see Example 20, below). More specifically, high levels of 46980 mRNA expression were found in human brain, spinal cord and dorsal root ganglia (see Table 5, below) as demonstrated by TaqMan analysis. 46980 molecules can function in a neuronal cell involved in pain response or in a neuronal developmental pathway.

[1850] Animal models of pain response include, but are not limited to, axotomy, the cutting or severing of an axon; chronic constriction injury (CCI), a model of neuropathic pain which involves ligation of the sciatic nerve in rodents, e.g., rats; or intraplantar Freund's adjuvant injection as a model of arthritic pain. Other animal models of pain response are described in, e.g., ILAR Journal (1999) Volume 40, Number 3 (entire issue). Disorders in which the 46980 molecules of the invention may be directly or indirectly involved include pain, pain syndromes, and inflammatory disorders, including inflammatory pain. Modulators of 46980 molecules may have analgesic effects, e.g., as evaluated by analgesic tests on animals, e.g., the hot plate test, tail flick test, writhing test, paw pressure test, an electric stimulation test, tail withdrawal test, or formalin test (Roques et al. (1995) Methods in Enzymology 248:263-283).

[1851] As the 46980 molecules of the invention may modulate 46980-mediated activities, they may be useful for developing novel diagnostic and therapeutic agents for 46980-mediated or related disorders. For example, the 46980 molecules can act as novel diagnostic targets and therapeutic agents controlling pain, pain disorders, and other neuronal disorders.

[1852] Examples of pain conditions include, but are not limited to, pain elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia; pain associated with musculoskeletal disorders, e.g., joint pain, or arthritis; tooth pain; headaches, e.g., migraine; pain associated with surgery; pain related to inflammation, e.g., irritable bowel syndrome; chest pain; or hyperalgesia, e.g., excessive sensitivity to pain (described in, for example, Fields (1987) Pain, New York: McGraw-Hill). Other examples of pain disorders or pain syndromes include, but are not limited to, complex regional pain syndrome (CRPS), reflex sympathetic dystrophy (RSD), causalgia, neuralgia, central pain and dysesthesia syndrome, carotidynia, neurogenic pain, refractory cervicobrachial pain syndrome, myofascial pain syndrome, craniomandibular pain dysfunction syndrome, chronic idiopathic pain syndrome, Costen's pain-dysfunction, acute chest pain syndrome, nonulcer dyspepsia, interstitial cystitis, gynecologic pain syndrome, patellofemoral pain syndrome, anterior knee pain syndrome, recurrent abdominal pain in children, colic, low back pain syndrome, neuropathic pain, phantom pain from amputation, phantom tooth pain, or pain asymbolia (the inability to feel pain). Other examples of pain conditions include pain induced by parturition, or post partum pain.

[1853] Agents that modulate 46980 polypeptide or nucleic acid activity or expression can be used to treat pain elicited by any medical condition. A subject receiving the treatment can be additionally treated with a second agent, e.g., an anti-inflammatory agent, an antibiotic, or a chemotherapeutic agent, to further ameliorate the condition.

[1854] The 46980 molecules can also act as novel diagnostic targets and therapeutic agents controlling pain caused by other disorders, e.g., cancer, e.g., prostate cancer.

[1855] As the 46980 molecules are highly expressed in brain, the 46980 molecules can act as diagnostic markers and therapeutic agents or targets in disorders of the brain. Disorders involving the brain include, but are not limited to, disorders involving neurons, and disorders involving glia, such as astrocytes, oligodendrocytes, ependymal cells, and microglia; cerebral edema, raised intracranial pressure and herniation, and hydrocephalus; malformations and developmental diseases, such as neural tube defects, forebrain anomalies, posterior fossa anomalies, and syringomyelia and hydromyelia; perinatal brain injury; cerebrovascular diseases, such as those related to hypoxia, ischemia, and infarction, including hypotension, hypoperfusion, and low-flow states—global cerebral ischemia and focal cerebral ischemia—infarction from obstruction of local blood supply, intracranial hemorrhage, including intracerebral (intraparenchymal) hemorrhage, subarachnoid hemorrhage and ruptured berry aneurysms, and vascular malformations, hypertensive cerebrovascular disease, including lacunar infarcts, slit hemorrhages, and hypertensive encephalopathy; infections, such as acute meningitis, including acute pyogenic (bacterial) meningitis and acute aseptic (viral) meningitis, acute focal suppurative infections, including brain abscess, subdural empyema, and extradural abscess, chronic bacterial meningoencephalitis, including tuberculosis and mycobacterioses, neurosyphilis, and neuroborreliosis (Lyme disease), viral meningoencephalitis, including arthropod-borne (Arbo) viral encephalitis, Herpes simplex virus Type 1, Herpes simplex virus Type 2, Varicella-zoster virus (Herpes zoster), cytomegalovirus, poliomyelitis, rabies, and human immunodeficiency virus 1, including HIV-1 meningoencephalitis (subacute encephalitis), vacuolar myelopathy, AIDS-associated myopathy, peripheral neuropathy, and AIDS in children, progressive multifocal leukoencephalopathy, subacute sclerosing panencephalitis, 
fungal meningoencephalitis, other infectious diseases of the nervous system; transmissible spongiform encephalopathies (prion diseases); demyelinating diseases, including multiple sclerosis, multiple sclerosis variants, acute disseminated encephalomyelitis and acute necrotizing hemorrhagic encephalomyelitis, and other diseases with demyelination; degenerative diseases, such as degenerative diseases affecting the cerebral cortex, including Alzheimer disease and Pick disease, degenerative diseases of basal ganglia and brain stem, including Parkinsonism, idiopathic Parkinson disease (paralysis agitans), progressive supranuclear palsy, corticobasal degeneration, multiple system atrophy, including striatonigral degeneration, Shy-Drager syndrome, and olivopontocerebellar atrophy, and Huntington disease; spinocerebellar degenerations, including spinocerebellar ataxias, including Friedreich ataxia, and ataxia-telangiectasia, degenerative diseases affecting motor neurons, including amyotrophic lateral sclerosis (motor neuron disease), bulbospinal atrophy (Kennedy syndrome), and spinal muscular atrophy; inborn errors of metabolism, such as leukodystrophies, including Krabbe disease, metachromatic leukodystrophy, adrenoleukodystrophy, Pelizaeus-Merzbacher disease, and Canavan disease, mitochondrial encephalomyopathies, including Leigh disease and other mitochondrial encephalomyopathies; toxic and acquired metabolic diseases, including vitamin deficiencies such as thiamine (vitamin B₁) deficiency and vitamin B₁₂ deficiency, neurologic sequelae of metabolic disturbances, including hypoglycemia, hyperglycemia, and hepatic encephalopathy, toxic disorders, including carbon monoxide, methanol, ethanol, and radiation, including combined methotrexate and radiation-induced injury; tumors, such as gliomas, including astrocytoma, including fibrillary (diffuse) astrocytoma and glioblastoma multiforme, pilocytic astrocytoma, pleomorphic xanthoastrocytoma, and brain stem glioma, 
oligodendroglioma, and ependymoma and related paraventricular mass lesions, neuronal tumors, poorly differentiated neoplasms, including medulloblastoma, other parenchymal tumors, including primary brain lymphoma, germ cell tumors, and pineal parenchymal tumors, meningiomas, metastatic tumors, paraneoplastic syndromes, peripheral nerve sheath tumors, including schwannoma, neurofibroma, and malignant peripheral nerve sheath tumor (malignant schwannoma), and neurocutaneous syndromes (phakomatoses), including neurofibromatosis, including Type 1 neurofibromatosis (NF1) and Type 2 neurofibromatosis (NF2), tuberous sclerosis, and Von Hippel-Lindau disease.

[1856] As 46980 is highly expressed in human testis, 46980 molecules can have a role in, e.g., fertility or spermatid development, and disorders of the testes (see Table 5, below). Human 46980 molecules can also act as novel diagnostic targets and therapeutic agents controlling sperm formation or other processes related to fertility, e.g., spermatogenesis or fertilization. Disorders involving the testis and epididymis include, but are not limited to, congenital anomalies such as cryptorchidism, regressive changes such as atrophy, inflammations such as nonspecific epididymitis and orchitis, granulomatous (autoimmune) orchitis, and specific inflammations including, but not limited to, gonorrhea, mumps, tuberculosis, and syphilis, vascular disturbances including torsion, testicular tumors including germ cell tumors that include, but are not limited to, seminoma, spermatocytic seminoma, embryonal carcinoma, yolk sac tumor, choriocarcinoma, teratoma, and mixed tumors, tumors of sex cord-gonadal stroma including, but not limited to, Leydig (interstitial) cell tumors and Sertoli cell tumors (androblastoma), and testicular lymphoma, and miscellaneous lesions of tunica vaginalis.

[1857] In addition, 46980 molecules may be useful for disorders of the prostate. A prostate disorder refers to an abnormal condition occurring in the male pelvic region characterized by, e.g., male sexual dysfunction and/or urinary symptoms. This disorder may be manifested in the form of genitourinary inflammation (e.g., inflammation of smooth muscle cells) as in several common diseases of the prostate including prostatitis, benign prostatic hyperplasia and cancer, e.g., adenocarcinoma or carcinoma, of the prostate.

[1858] The 46980 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO:28 are collectively referred to as “polypeptides or proteins of the invention” or “46980 polypeptides or proteins.” Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “46980 nucleic acids.” 46980 molecules refer to 46980 nucleic acids, polypeptides, and antibodies.

[1859] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA), RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA. A DNA or RNA analog can be synthesized from nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[1860] The term “isolated nucleic acid molecule” or “purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules present in the natural source of the nucleic acid. For example, with regard to genomic DNA, the term “isolated” includes nucleic acid molecules which are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[1861] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and nonaqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified.
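The four sets of conditions enumerated above can be collected in a small lookup structure. The following is a minimal illustrative sketch only; the names `Stringency`, `CONDITIONS`, and `get_conditions` are hypothetical and are not part of the disclosure. The low stringency wash is recorded at its stated minimum of 50° C., although the text permits raising it to 55° C.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stringency:
    """One hybridization/wash regime as enumerated in the text."""
    hybridization: str      # hybridization buffer
    hyb_temp_c: int         # approximate hybridization temperature (deg C)
    wash: str               # wash buffer
    wash_temp_c: int        # wash temperature (deg C)


# Values transcribed from the four numbered conditions in the paragraph.
CONDITIONS = {
    "low": Stringency("6x SSC", 45, "0.2x SSC, 0.1% SDS", 50),
    "medium": Stringency("6x SSC", 45, "0.2x SSC, 0.1% SDS", 60),
    "high": Stringency("6x SSC", 45, "0.2x SSC, 0.1% SDS", 65),
    "very_high": Stringency("0.5 M sodium phosphate, 7% SDS", 65,
                            "0.2x SSC, 1% SDS", 65),
}

# The text states that very high stringency applies unless otherwise specified.
DEFAULT_LEVEL = "very_high"


def get_conditions(level=None):
    """Return the regime for `level`, falling back to the stated default."""
    return CONDITIONS[level or DEFAULT_LEVEL]
```

Calling `get_conditions()` with no argument returns the very high stringency regime, mirroring the default stated in the text.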

[1862] Preferably, an isolated nucleic acid molecule of the invention that hybridizes under a stringency condition described herein to the sequence of SEQ ID NO:27 or SEQ ID NO:29 corresponds to a naturally-occurring nucleic acid molecule.

[1863] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature. For example, a naturally occurring nucleic acid molecule can encode a natural protein. As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules which include at least an open reading frame encoding a 46980 protein. The gene can optionally further include non-coding sequences, e.g., regulatory sequences and introns. Preferably, a gene encodes a mammalian 46980 protein or derivative thereof.

[1864] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. “Substantially free” means that a preparation of 46980 protein is at least 10% pure. In a preferred embodiment, the preparation of 46980 protein has less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-46980 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-46980 chemicals. When the 46980 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[1865] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 46980 without abolishing or substantially altering a 46980 activity. Preferably the alteration does not substantially alter the 46980 activity, e.g., the activity is at least 20%, 40%, 60%, 70% or 80% of wild-type. An “essential” amino acid residue is a residue that, when altered from the wild-type sequence of 46980, results in abolishing a 46980 activity such that less than 20% of the wild-type activity is present. For example, conserved amino acid residues in 46980 are predicted to be particularly unamenable to alteration.

[1866] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a 46980 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a 46980 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 46980 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO:27 or SEQ ID NO:29, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
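The side chain families listed above can be expressed as sets of one-letter amino acid codes, so that a proposed substitution can be checked for membership in a shared family. This is a minimal sketch; the names `SIDE_CHAIN_FAMILIES` and `is_conservative` are hypothetical. Because the families in the text overlap (e.g., threonine is both uncharged polar and beta-branched), a substitution is treated here as conservative if the two residues share at least one family.

```python
# Families transcribed from the paragraph, using one-letter codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),               # lysine, arginine, histidine
    "acidic": set("DE"),               # aspartic acid, glutamic acid
    "uncharged_polar": set("GNQSTYC"), # Gly, Asn, Gln, Ser, Thr, Tyr, Cys
    "nonpolar": set("AVLIPFMW"),       # Ala, Val, Leu, Ile, Pro, Phe, Met, Trp
    "beta_branched": set("TVI"),       # threonine, valine, isoleucine
    "aromatic": set("YFWH"),           # Tyr, Phe, Trp, His
}


def is_conservative(original, replacement):
    """Return True if the two residues differ and share a side chain family."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in family and b in family
               for family in SIDE_CHAIN_FAMILIES.values())
```

For example, `is_conservative("K", "R")` is true (both basic), whereas `is_conservative("D", "K")` is false (acidic versus basic).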

[1867] As used herein, a “biologically active portion” of a 46980 protein includes a fragment of a 46980 protein which participates in an interaction, e.g., an intramolecular or an inter-molecular interaction. An inter-molecular interaction can be a specific binding interaction or an enzymatic interaction (e.g., the interaction can be transient and a covalent bond is formed or broken). An inter-molecular interaction can be between a 46980 molecule and a non-46980 molecule or between a first 46980 molecule and a second 46980 molecule (e.g., a dimerization interaction). Biologically active portions of a 46980 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 46980 protein, e.g., the amino acid sequence shown in SEQ ID NO:28, which include fewer amino acids than the full length 46980 proteins, and exhibit at least one activity of a 46980 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 46980 protein, e.g., a protein binding or recognition function, e.g., neurexin binding or recognition. A biologically active portion of a 46980 protein can be a polypeptide which is, for example, 10, 25, 50, 100, 200 or more amino acids in length. Biologically active portions of a 46980 protein can be used as targets for developing agents which modulate a 46980 mediated activity, e.g., a protein binding or recognition function, e.g., neurexin binding or recognition.

[1868] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[1869] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, 60%, and even more preferably at least 70%, 80%, 90%, 100% of the length of the reference sequence. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”).

[1870] The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
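By way of illustration, percent identity can be computed from two sequences that have already been aligned, with gaps written as `-`. The sketch below is one possible convention and is an assumption: it divides the number of identical positions by the number of alignment columns (including gapped columns), so that each introduced gap position lowers the result. Other denominators are in use, and the text does not fix one. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned, equal-length sequences.

    Assumed convention: identical columns / all alignment columns,
    ignoring any column that is a gap in both sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Drop columns where both sequences carry a gap; they hold no residues.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]
    if not columns:
        return 0.0
    identical = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * identical / len(columns)
```

With the alignment `ACGT-A` over `ACCTGA`, four of six columns are identical, giving roughly 66.7% under this convention.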

[1871] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch ((1970) J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used unless otherwise specified) is a BLOSUM 62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
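The global alignment approach of Needleman and Wunsch can be sketched with dynamic programming as follows. This is a deliberately simplified illustration and not the GAP program: it assumes a flat match/mismatch score in place of a substitution matrix and a single linear gap penalty in place of separate gap and length weights. The function name and the default scores are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment with linear gap penalty (simplified sketch).

    Returns (score, aligned_a, aligned_b), with '-' marking gaps.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align residues
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1]); out_b.append(b[j - 1])
                i -= 1; j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

The aligned strings returned by this sketch can then be compared column by column to obtain a percent identity.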

[1872] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Myers and W. Miller ((1988) CABIOS, 4:11-17) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[1873] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 46980 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 46980 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25:3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[1874] Particularly preferred 46980 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO:28. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:28 are termed substantially identical.

[1875] In the context of nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:27 or 29 are termed substantially identical.

[1876] “Misexpression or aberrant expression”, as used herein, refers to a non-wildtype pattern of gene expression at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over- or under-expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of altered, e.g., increased or decreased, expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, translated amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[1877] “Subject,” as used herein, refers to human and non-human animals. The term “non-human animals” of the invention includes all vertebrates, e.g., mammals, such as non-human primates (particularly higher primates), sheep, dog, rodent (e.g., mouse or rat), guinea pig, goat, pig, cat, rabbits, cow, and non-mammals, such as chickens, amphibians, reptiles, etc. In a preferred embodiment, the subject is a human. In another embodiment, the subject is an experimental animal or animal suitable as a disease model.

[1878] A “purified preparation of cells”, as used herein, refers to an in vitro preparation of cells. In the case of cells from multicellular organisms (e.g., plants and animals), a purified preparation of cells is a subset of cells obtained from the organism, not the entire intact organism. In the case of unicellular microorganisms (e.g., cultured cells and microbial cells), it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[1879] Various aspects of the invention are described in further detail below.

[1880] Isolated Nucleic Acid Molecules of 46980

[1881] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes a 46980 polypeptide described herein, e.g., a full-length 46980 protein or a fragment thereof, e.g., a biologically active portion of 46980 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 46980 mRNA, and fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[1882] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO:27, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 46980 protein (i.e., “the coding region” of SEQ ID NO:27, as shown in SEQ ID NO:29), as well as 5′ untranslated sequences. Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:27 (e.g., SEQ ID NO:29) and, e.g., no flanking sequences which normally accompany the subject sequence. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to a fragment of the protein from about amino acid 25 to 590.

[1883] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule which is a complement of the nucleotide sequence shown in SEQ ID NO:27 or SEQ ID NO:29, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO:27 or SEQ ID NO:29, such that it can hybridize (e.g., under a stringency condition described herein) to the nucleotide sequence shown in SEQ ID NO:27 or 29, thereby forming a stable duplex.

[1884] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence which is at least about: 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO:27 or SEQ ID NO:29, or a portion, preferably of the same length, of any of these nucleotide sequences.

[1885] 46980 Nucleic Acid Fragments

[1886] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO:27 or 29. For example, such a nucleic acid molecule can include a fragment which can be used as a probe or primer or a fragment encoding a portion of a 46980 protein, e.g., an immunogenic or biologically active portion of a 46980 protein. A fragment can comprise those nucleotides of SEQ ID NO:27, which encode a carboxylesterase domain of human 46980. The nucleotide sequence determined from the cloning of the 46980 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 46980 family members, or fragments thereof, as well as 46980 homologues, or fragments thereof, from other species.

[1887] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment which includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 100, 200, 400, 450, 500, 550, 600, 720, 760, 800 amino acids in length. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention.

[1888] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domains, regions, or functional sites described herein. Thus, for example, a 46980 nucleic acid fragment can include a sequence corresponding to a carboxylesterase domain, an extracellular domain, or an intracellular domain.

[1889] 46980 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under a stringency condition described herein to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO:27 or SEQ ID NO:29, or of a naturally occurring allelic variant or mutant of SEQ ID NO:27 or SEQ ID NO:29.

[1890] In a preferred embodiment the nucleic acid is a probe which is at least 5 or 10, and less than 200, more preferably less than 100, or less than 50, base pairs in length. It should be identical, or differ by 1, or by fewer than 5 or 10, bases from a sequence disclosed herein. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[1891] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid which encodes: a carboxylesterase domain from about amino acid 25 to 590 of SEQ ID NO:28.
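When designing a probe or primer against the region encoding a stated amino acid range, the corresponding nucleotide positions within a coding sequence follow from the three-nucleotide codon length. The sketch below assumes, for illustration only, that the coding sequence is numbered so that nucleotide 1 is the first base of the codon for amino acid 1; whether a given sequence listing is numbered that way is not established here. The function name `aa_range_to_cds_nt` is hypothetical.

```python
def aa_range_to_cds_nt(first_aa, last_aa):
    """Map a 1-based, inclusive amino acid range to 1-based, inclusive
    nucleotide positions in a coding sequence.

    Assumes nucleotide 1 is the first base of codon 1 (an assumption).
    """
    if first_aa < 1 or last_aa < first_aa:
        raise ValueError("expected 1 <= first_aa <= last_aa")
    start_nt = 3 * (first_aa - 1) + 1  # first base of the first codon
    end_nt = 3 * last_aa               # last base of the last codon
    return start_nt, end_nt
```

Under that assumption, amino acids 25 to 590 correspond to coding nucleotides 73 to 1770, a span of 1698 nucleotides (566 codons).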

[1892] In another embodiment a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 46980 sequence, e.g., a domain, region, site or other sequence described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical, or differ by one base, from a sequence disclosed herein or from a naturally occurring variant. For example, primers suitable for amplifying all or a portion of any of the following regions are provided: a carboxylesterase domain from about amino acid 25 to 590 of SEQ ID NO:28.

[1893] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[1894] A nucleic acid fragment encoding a “biologically active portion of a 46980 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:27 or 29, which encodes a polypeptide having a 46980 biological activity (e.g., the biological activities of the 46980 proteins are described herein), expressing the encoded portion of the 46980 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 46980 protein. For example, a nucleic acid fragment encoding a biologically active portion of 46980 includes a carboxylesterase domain, e.g., amino acid residues about 25 to 590 of SEQ ID NO:28. A nucleic acid fragment encoding a biologically active portion of a 46980 polypeptide may comprise a nucleotide sequence which is greater than 300, 400, 500 or more nucleotides in length.

[1895] In preferred embodiments, a nucleic acid includes a nucleotide sequence which is about 300, 400, 500, 600, 800, 1000, 1200, 1500, 2000, 2200, 2400, 2800, 3000 or more nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO:27, or SEQ ID NO:29. A nucleic acid fragment can include at least a contiguous nucleic acid sequence from a region selected from the group consisting of about nucleotides 1-243, 1-368, 244-368, 1822-2100, 465-517, 1405-1422, 2544-2700, 780-900, or 493-600 of SEQ ID NO:27. A nucleic acid fragment can include at least a contiguous nucleic acid sequence that encodes a peptide having an amino acid sequence of a region selected from the group consisting of about amino acids 1-150, 139-156, 450-460, 300-456, 139-200, 1-175, 614-700, 300-375, 430-480, or 520-600 of SEQ ID NO:28.

[1896] In another preferred embodiment, the nucleic acid fragment encodes an amino acid sequence that is at least 650 amino acids in length, e.g., at least 675, 700, 725, 750, or 800 amino acids in length.

[1897] In still another embodiment, the nucleic acid fragment is other than the nucleic acid sequence of BAA86574 or BAA76795.

[1898] 46980 Nucleic Acid Variants

[1899] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:27 or SEQ ID NO:29. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid which encodes the same 46980 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence which differs, by at least 1, but less than 5, 10, 20, 50, or 100 amino acid residues, from that shown in SEQ ID NO:28. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.
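Degeneracy of the genetic code can be illustrated by translating two different coding sequences and confirming that they encode the same protein. The following is a minimal sketch using the standard genetic code; the names `CODON_TABLE`, `translate`, and `encode_same_protein` are hypothetical, and the short sequences in the usage note are invented examples, not portions of any sequence disclosed herein.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cds):
    """Translate a coding sequence (DNA or RNA) into one-letter amino acids.

    Stop codons are rendered as '*'.
    """
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def encode_same_protein(cds_a, cds_b):
    """True if two coding sequences translate to the same protein."""
    return translate(cds_a) == translate(cds_b)
```

For instance, `ATGGCC` and `ATGGCT` differ at the third position of the second codon but both translate to Met-Ala, so `encode_same_protein("ATGGCC", "ATGGCT")` is true.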

[1900] Nucleic acids of the invention can be chosen for having codons which are preferred, or non-preferred, for a particular expression system. For example, the nucleic acid can be one in which at least one codon, and preferably at least 10% or 20% of the codons, has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells.
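The proportion of altered codons referred to above can be checked with a short calculation. The following is a minimal sketch only: the function name and the toy sequences are hypothetical illustrations, not part of SEQ ID NO:27 or SEQ ID NO:29, and the sketch assumes two in-frame coding sequences of equal length.

```python
def fraction_codons_altered(original: str, altered: str) -> float:
    """Return the fraction of codons that differ between two in-frame
    coding sequences of equal length (hypothetical helper)."""
    if not original or len(original) != len(altered) or len(original) % 3 != 0:
        raise ValueError("sequences must be non-empty, in-frame, and of equal length")
    # Split both sequences into consecutive triplets.
    codons_a = [original[i:i + 3] for i in range(0, len(original), 3)]
    codons_b = [altered[i:i + 3] for i in range(0, len(altered), 3)]
    # Count positions at which the codon was changed.
    changed = sum(a != b for a, b in zip(codons_a, codons_b))
    return changed / len(codons_a)


# Toy example: both sequences encode Met-Leu-Arg-Pro, but three of the
# four codons were swapped for synonymous codons.
toy_original = "ATGCTAAGACCC"
toy_altered = "ATGCTGCGTCCG"
print(fraction_codons_altered(toy_original, toy_altered))  # 0.75
```

A result of 0.10 or 0.20 or more would correspond to the "at least 10% or 20% of the codons" thresholds; which codons are actually preferred in a given expression system is outside the scope of this sketch.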

[1901] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism), or can be non-naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared in the encoded product).

[1902] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO:27 or 29, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 1%, 5%, 10% or 20% of the nucleotides in the subject nucleic acid. If necessary for this analysis the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.
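The counting rule described above (mismatches and "looped" out positions both count as differences) can be sketched as follows. This is an illustrative sketch under stated assumptions: the two sequences are assumed to have already been aligned for maximum homology by some other tool, with "-" marking looped-out positions; the function names and the toy sequences are hypothetical.

```python
def count_differences(aligned_subject: str, aligned_reference: str) -> int:
    """Count mismatches and looped-out (gap) positions between two
    pre-aligned sequences of equal aligned length."""
    if len(aligned_subject) != len(aligned_reference):
        raise ValueError("aligned sequences must have the same length")
    # A column counts as a difference whenever the two symbols differ,
    # which covers both mismatches and gap-versus-residue columns.
    return sum(1 for s, r in zip(aligned_subject, aligned_reference) if s != r)


def fraction_different(aligned_subject: str, aligned_reference: str) -> float:
    """Differences as a fraction of the nucleotides in the subject sequence."""
    subject_length = len(aligned_subject.replace("-", ""))
    if subject_length == 0:
        raise ValueError("subject sequence is empty")
    return count_differences(aligned_subject, aligned_reference) / subject_length


# Toy alignment: one looped-out position and one mismatch.
subject = "ATGC-TTA"
reference = "ATGCATAA"
print(count_differences(subject, reference))  # 2
```

The count can then be compared against the stated bounds, e.g., at least one but less than 10, 20, 30, or 40 nucleotides, and the fraction against the 1%, 5%, 10% or 20% bounds.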

[1903] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the nucleotide sequence shown in SEQ ID NO:28 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under a stringency condition described herein, to the nucleotide sequence shown in SEQ ID NO:28 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 46980 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 46980 gene.

[1904] Preferred variants include those that are correlated with a protein binding or recognition function, e.g., neurexin binding or recognition.

[1905] Allelic variants of 46980, e.g., human 46980, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 46980 protein within a population that maintain the ability to bind neurexin. Functional allelic variants will typically contain only conservative substitution of one or more amino acids of SEQ ID NO:28, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally occurring amino acid sequence variants of the 46980, e.g., human 46980, protein within a population that do not have the ability to bind neurexin. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO:28, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[1906] Moreover, nucleic acid molecules encoding other 46980 family members and, thus, which have a nucleotide sequence which differs from the 46980 sequences of SEQ ID NO:27 or SEQ ID NO:29 are intended to be within the scope of the invention.

[1907] Antisense Nucleic Acid Molecules, Ribozymes and Modified 46980 Nucleic Acid Molecules

[1908] In another aspect, the invention features an isolated nucleic acid molecule which is antisense to 46980. An “antisense” nucleic acid can include a nucleotide sequence which is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 46980 coding strand, or to only a portion thereof (e.g., the coding region of human 46980 corresponding to SEQ ID NO:29). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 46980 (e.g., the 5′ and 3′ untranslated regions).

[1909] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 46980 mRNA, but more preferably is an oligonucleotide which is antisense to only a portion of the coding or noncoding region of 46980 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 46980 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence of interest. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
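As an illustration of the start-site targeting described above, the following minimal sketch derives an antisense oligonucleotide as the reverse complement of the window spanning positions −10 to +10 around the translation start site. It assumes the convention that +1 is the A of the ATG start codon and that there is no position 0, so the window is 20 nucleotides long. The function name and the toy cDNA are hypothetical and are not taken from SEQ ID NO:27.

```python
# Watson-Crick complement table for a DNA-alphabet sequence.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(cdna: str, start_index: int,
                    upstream: int = 10, downstream: int = 10) -> str:
    """Return the reverse complement of the window around the start codon.

    start_index is the 0-based index of the A of the ATG start codon.
    """
    begin = start_index - upstream
    end = start_index + downstream
    if begin < 0 or end > len(cdna):
        raise ValueError("window extends beyond the sequence")
    window = cdna[begin:end].upper()
    # Complement each base, then reverse so the oligo reads 5' to 3'.
    return window.translate(_COMPLEMENT)[::-1]


# Toy sense-strand sequence with the ATG start codon at index 10.
toy_cdna = "GGCCGCCACCATGGCTAGCAAA"
print(antisense_oligo(toy_cdna, toy_cdna.index("ATG")))
# TGCTAGCCATGGTGGCGGCC
```

Other oligonucleotide lengths within the listed range (e.g., about 7 to 80 or more nucleotides) can be obtained by changing the upstream and downstream arguments.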

[1910] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[1911] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a 46980 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies which bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

[1912] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[1913] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for a 46980-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of a 46980 cDNA disclosed herein (i.e., SEQ ID NO:27 or SEQ ID NO:29), and a known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haseloff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a 46980-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 46980 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science 261:1411-1418.

[1914] 46980 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 46980 gene (e.g., the 46980 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 46980 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6:569-84; Helene, C. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) Bioessays 14:807-15. The potential sequences that can be targeted for triple helix formation can be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.

[1915] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[1916] A 46980 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For non-limiting examples of synthetic oligonucleotides with modifications see Toulmé (2001) Nature Biotech. 19:17 and Faria et al. (2001) Nature Biotech. 19:40-44. Such phosphoramidate oligonucleotides can be effective antisense agents.

[1917] For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4: 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra and Perry-O'Keefe et al. Proc. Natl. Acad. Sci. 93: 14670-675.

[1918] PNAs of 46980 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 46980 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene, (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes, (e.g., S1 nucleases (Hyrup B. et al. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[1919] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[1920] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region which is complementary to a 46980 nucleic acid of the invention, two complementary regions one having a fluorophore and one a quencher such that the molecular beacon is useful for quantitating the presence of the 46980 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.

[1921] Isolated 46980 Polypeptides

[1922] In another aspect, the invention features an isolated 46980 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-46980 antibodies. 46980 protein can be isolated from cells or tissue sources using standard protein purification techniques. 46980 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[1923] Polypeptides of the invention include those which arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when expressed in a native cell.

[1924] In a preferred embodiment, a 46980 polypeptide has one or more of the following characteristics:

[1925] (i) it has the ability to recognize and/or bind a ligand, e.g., a neurexin, e.g., a β-neurexin;

[1926] (ii) it has a molecular weight, e.g., a deduced molecular weight, preferably ignoring any contribution of post translational modifications, amino acid composition or other physical characteristic of a 46980 polypeptide, e.g., the polypeptide of SEQ ID NO:28;

[1927] (iii) it has an overall sequence similarity of at least 60%, more preferably at least 70, 80, 90, or 95%, with a polypeptide of SEQ ID NO:28;

[1928] (iv) it can be found in the central and/or peripheral nervous system;

[1929] (v) it has a carboxylesterase domain which is preferably about 70%, 80%, 90% or 95% identical with amino acid residues about 25 to 590 of SEQ ID NO:28;

[1930] (vi) it can include conserved cysteines that form disulfide bonds, notably cysteines at about amino acid 110, 146, 306, and 317 of SEQ ID NO:28; and

[1931] (vii) it can include an altered carboxylesterase active site that lacks a conserved serine at about amino acid 254, but retains a conserved acidic residue at about amino acid 375 of SEQ ID NO:28, and a conserved histidine at about amino acid 489 of SEQ ID NO:28.

[1932] In a preferred embodiment the 46980 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:28. In one embodiment it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another it differs from the corresponding sequence in SEQ ID NO:28 by at least one residue but less than 20%, 15%, 10% or 5% of the residues in it differ from the corresponding sequence in SEQ ID NO:28. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non essential residue or a conservative substitution. In a preferred embodiment the differences are not in the carboxylesterase domain. In another preferred embodiment one or more differences are in the carboxylesterase domain.

[1933] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 46980 proteins differ in amino acid sequence from SEQ ID NO:28, yet retain biological activity.

[1934] In one embodiment, the protein includes an amino acid sequence at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more homologous to SEQ ID NO:28.

[1935] A 46980 protein or fragment is provided which varies from the sequence of SEQ ID NO:28 in regions defined by amino acids about 696 to 891 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment but which does not differ from SEQ ID NO:28 in regions defined by amino acids about 41 to 674. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.

[1936] In one embodiment, a biologically active portion of a 46980 protein includes a carboxylesterase domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 46980 protein.

[1937] In a preferred embodiment, the 46980 protein has an amino acid sequence shown in SEQ ID NO:28. In other embodiments, the 46980 protein is substantially identical to SEQ ID NO:28. In yet another embodiment, the 46980 protein is substantially identical to SEQ ID NO:28 and retains the functional activity of the protein of SEQ ID NO:28, as described in detail in the subsections above.

[1938] In another preferred embodiment, the 46980 polypeptide can have an amino acid sequence other than that encoded by BAA86574 (KIAA1260) and/or other than that encoded by BAA76795 (KIAA0951).

[1939] 46980 Chimeric or Fusion Proteins

[1940] In another aspect, the invention provides 46980 chimeric or fusion proteins. As used herein, a 46980 “chimeric protein” or “fusion protein” includes a 46980 polypeptide linked to a non-46980 polypeptide. A “non-46980 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the 46980 protein, e.g., a protein which is different from the 46980 protein and which is derived from the same or a different organism. The 46980 polypeptide of the fusion protein can correspond to all or a portion, e.g., a fragment described herein, of a 46980 amino acid sequence. In a preferred embodiment, a 46980 fusion protein includes at least one (or two) biologically active portions of a 46980 protein. The non-46980 polypeptide can be fused to the N-terminus or C-terminus of the 46980 polypeptide.

[1941] The fusion protein can include a moiety which has a high affinity for a ligand. For example, the fusion protein can be a GST-46980 fusion protein in which the 46980 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 46980. Alternatively, the fusion protein can be a 46980 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 46980 can be increased through use of a heterologous signal sequence.

[1942] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[1943] The 46980 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 46980 fusion proteins can be used to affect the bioavailability of a 46980 substrate. 46980 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example, (i) aberrant modification or mutation of a gene encoding a 46980 protein; (ii) mis-regulation of the 46980 gene; and (iii) aberrant post-translational modification of a 46980 protein.

[1944] Moreover, the 46980 fusion proteins of the invention can be used as immunogens to produce anti-46980 antibodies in a subject, to purify 46980 ligands, and in screening assays to identify molecules which inhibit the interaction of 46980 with a 46980 substrate.

[1945] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 46980-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 46980 protein.

[1946] Variants of 46980 Proteins

[1947] In another aspect, the invention also features a variant of a 46980 polypeptide, e.g., which functions as an agonist (mimetics) or as an antagonist. Variants of the 46980 proteins can be generated by mutagenesis, e.g., discrete point mutation, the insertion or deletion of sequences or the truncation of a 46980 protein. An agonist of the 46980 proteins can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of a 46980 protein. An antagonist of a 46980 protein can inhibit one or more of the activities of the naturally occurring form of the 46980 protein by, for example, competitively modulating a 46980-mediated activity of a 46980 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 46980 protein.

[1948] Variants of a 46980 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a 46980 protein for agonist or antagonist activity.

[1949] Libraries of fragments, e.g., N-terminal, C-terminal, or internal fragments, of a 46980 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of a 46980 protein. Variants in which a cysteine residue is added or deleted or in which a residue which is glycosylated is added or deleted are particularly preferred.

[1950] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property are known in the art. Such methods are adaptable for rapid screening of the gene libraries generated by combinatorial mutagenesis of 46980 proteins. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify 46980 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6:327-331).

[1951] Cell-based assays can be exploited to analyze a variegated 46980 library. For example, a library of expression vectors can be transfected into a cell line, e.g., a cell line which ordinarily responds to a 46980 polypeptide in a substrate-dependent manner. The transfected cells are then contacted with a 46980 polypeptide (e.g., using a soluble form of the extracellular domain of 46980 or a cell expressing a transmembrane form of 46980) and the effect of the expression of the mutant on signaling by the 46980 substrate can be detected, e.g., by measuring a protein binding or recognition function, e.g., neurexin binding or recognition. Plasmid DNA can then be recovered from the cells which score for inhibition, or alternatively, potentiation of signaling by the 46980 substrate, and the individual clones further characterized.

[1952] In another aspect, the invention features a method of making a 46980 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 46980 polypeptide. The method includes: altering the sequence of a 46980 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[1953] In another aspect, the invention features a method of making a fragment or analog of a 46980 polypeptide having a biological activity of a naturally occurring 46980 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of a 46980 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[1954] Anti-46980 Antibodies

[1955] In another aspect, the invention provides an anti-46980 antibody, or a fragment thereof (e.g., an antigen-binding fragment thereof). The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. As used herein, the term “antibody” refers to a protein comprising at least one, and preferably two, heavy (H) chain variable regions (abbreviated herein as VH), and at least one and preferably two light (L) chain variable regions (abbreviated herein as VL). The VH and VL regions can be further subdivided into regions of hypervariability, termed “complementarity determining regions” (“CDR”), interspersed with regions that are more conserved, termed “framework regions” (FR). The extent of the framework region and CDR's has been precisely defined (see, Kabat, E. A., et al. (1991) Sequences of Proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, and Chothia, C. et al. (1987) J. Mol. Biol. 196:901-917, which are incorporated herein by reference). Each VH and VL is composed of three CDR's and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.

[1956] The anti-46980 antibody can further include a heavy and light chain constant region, to thereby form a heavy and light immunoglobulin chain, respectively. In one embodiment, the antibody is a tetramer of two heavy immunoglobulin chains and two light immunoglobulin chains, wherein the heavy and light immunoglobulin chains are inter-connected by, e.g., disulfide bonds. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. The light chain constant region is comprised of one domain, CL. The variable region of the heavy and light chains contains a binding domain that interacts with an antigen. The constant regions of the antibodies typically mediate the binding of the antibody to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system.

[1957] As used herein, the term “immunoglobulin” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes. The recognized human immunoglobulin genes include the kappa, lambda, alpha (IgA1 and IgA2), gamma (IgG1, IgG2, IgG3, IgG4), delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Full-length immunoglobulin “light chains” (about 25 kDa or 214 amino acids) are encoded by a variable region gene at the NH2-terminus (about 110 amino acids) and a kappa or lambda constant region gene at the COOH-terminus. Full-length immunoglobulin “heavy chains” (about 50 kDa or 446 amino acids) are similarly encoded by a variable region gene (about 116 amino acids) and one of the other aforementioned constant region genes, e.g., gamma (encoding about 330 amino acids).

[1958] The term “antigen-binding fragment” of an antibody (or simply “antibody portion,” or “fragment”), as used herein, refers to one or more fragments of a full-length antibody that retain the ability to specifically bind to the antigen, e.g., 46980 polypeptide or fragment thereof. Examples of antigen-binding fragments of the anti-46980 antibody include, but are not limited to: (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab′)₂ fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 341:544-546), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) Science 242:423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879-5883). Such single chain antibodies are also encompassed within the term “antigen-binding fragment” of an antibody. These antibody fragments are obtained using conventional techniques known to those with skill in the art, and the fragments are screened for utility in the same manner as are intact antibodies.

[1959] The anti-46980 antibody can be a polyclonal or a monoclonal antibody. In other embodiments, the antibody can be recombinantly produced, e.g., produced by phage display or by combinatorial methods.

[1960] Phage display and combinatorial methods for generating anti-46980 antibodies are known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. International Publication No. WO 92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. International Publication No. WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and Barbas et al. (1991) PNAS 88:7978-7982, the contents of all of which are incorporated by reference herein).

[1961] In one embodiment, the anti-46980 antibody is a fully human antibody (e.g., an antibody made in a mouse which has been genetically engineered to produce an antibody from a human immunoglobulin sequence), or a non-human antibody, e.g., a rodent (mouse or rat), goat, primate (e.g., monkey), or camel antibody. Preferably, the non-human antibody is a rodent (mouse or rat) antibody. Methods of producing rodent antibodies are known in the art.

[1962] Human monoclonal antibodies can be generated using transgenic mice carrying the human immunoglobulin genes rather than the mouse system. Splenocytes from these transgenic mice immunized with the antigen of interest are used to produce hybridomas that secrete human mAbs with specific affinities for epitopes from a human protein (see, e.g., Wood et al. International Application WO 91/00906, Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International Application WO 92/03918; Kay et al. International Application WO 92/03917; Lonberg, N. et al. 1994 Nature 368:856-859; Green, L. L. et al. 1994 Nature Genet. 7:13-21; Morrison, S. L. et al. 1984 Proc. Natl. Acad. Sci. USA 81:6851-6855; Bruggeman et al. 1993 Year Immunol 7:33-40; Tuaillon et al. 1993 PNAS 90:3720-3724; Bruggeman et al. 1991 Eur J Immunol 21:1323-1326).

[1963] An anti-46980 antibody can be one in which the variable region, or a portion thereof, e.g., the CDR's, is generated in a non-human organism, e.g., a rat or mouse. Chimeric, CDR-grafted, and humanized antibodies are within the invention. Antibodies generated in a non-human organism, e.g., a rat or mouse, and then modified, e.g., in the variable framework or constant region, to decrease antigenicity in a human are within the invention.

[1964] Chimeric antibodies can be produced by recombinant DNA techniques known in the art. For example, a gene encoding the Fc constant region of a murine (or other species) monoclonal antibody molecule is digested with restriction enzymes to remove the region encoding the murine Fc, and the equivalent portion of a gene encoding a human Fc constant region is substituted (see Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., International Application WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al., European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al. (1987) PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al. (1985) Nature 314:446-449; and Shaw et al., 1988, J. Natl Cancer Inst. 80:1553-1559).

[1965] A humanized or CDR-grafted antibody will have at least one or two but generally all three recipient CDR's (of heavy and/or light immunoglobulin chains) replaced with a donor CDR. A CDR may be replaced with at least a portion of a non-human CDR, or only some of the CDR's may be replaced with non-human CDR's. It is only necessary to replace the number of CDR's required for binding of the humanized antibody to a 46980 polypeptide or a fragment thereof. Preferably, the donor will be a rodent antibody, e.g., a rat or mouse antibody, and the recipient will be a human framework or a human consensus framework. Typically, the immunoglobulin providing the CDR's is called the “donor” and the immunoglobulin providing the framework is called the “acceptor.” In one embodiment, the donor immunoglobulin is a non-human (e.g., rodent) immunoglobulin. The acceptor framework is a naturally-occurring (e.g., a human) framework or a consensus framework, or a sequence about 85% or higher, preferably 90%, 95%, 99% or higher identical thereto.
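
By way of illustration only, the identity thresholds recited above (about 85%, 90%, 95%, or 99%) could be checked with a short Python sketch such as the following. The sketch assumes the two framework sequences have already been aligned to equal length, with "-" marking gaps, and it counts identical non-gap positions over the full alignment length; that scoring convention, the function names, and the example sequences are assumptions made for this illustration and are not taken from the specification.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both sequences.

    Assumes the inputs are pre-aligned (equal length, '-' for gaps).
    Gap columns never count as matches.
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and aligned to equal length")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a, aligned_b, threshold=85.0):
    """True if the two aligned sequences are at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 10-residue framework stretches differing at one position.
candidate = "QVQLVESGAE"
reference = "QVQLVQSGAE"
print(percent_identity(candidate, reference))
print(meets_identity_threshold(candidate, reference, 85.0))
```

A production comparison would normally first align the sequences with an established alignment program; this sketch covers only the final counting step.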

[1966] As used herein, the term “consensus sequence” refers to the sequence formed from the most frequently occurring amino acids (or nucleotides) in a family of related sequences (See e.g., Winnacker, From Genes to Clones (Verlagsgesellschaft, Weinheim, Germany 1987)). In a family of proteins, each position in the consensus sequence is occupied by the amino acid occurring most frequently at that position in the family. If two amino acids occur equally frequently, either can be included in the consensus sequence. A “consensus framework” refers to the framework region in the consensus immunoglobulin sequence.
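
The definition of a consensus sequence given above can be sketched in Python as follows. This is a minimal illustration that assumes the family members have already been aligned to equal length; the function name and the example sequences are hypothetical. Because the definition permits either residue where two occur equally frequently, the sketch arbitrarily selects the alphabetically first residue so that the result is reproducible.

```python
from collections import Counter


def consensus_sequence(aligned_sequences):
    """Return the most frequent residue at each position of pre-aligned sequences."""
    if not aligned_sequences:
        raise ValueError("at least one sequence is required")
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be aligned to the same length")

    consensus = []
    for column in zip(*aligned_sequences):
        counts = Counter(column)
        highest = max(counts.values())
        # Tie rule (an arbitrary choice for this sketch): alphabetically first residue.
        consensus.append(min(res for res, n in counts.items() if n == highest))
    return "".join(consensus)


# Hypothetical family of three aligned five-residue stretches.
family = ["QVQLV", "EVQLV", "QVKLL"]
print(consensus_sequence(family))
```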

[1967] An antibody can be humanized by methods known in the art. Humanized antibodies can be generated by replacing sequences of the Fv variable region which are not directly involved in antigen binding with equivalent sequences from human Fv variable regions. General methods for generating humanized antibodies are provided by Morrison, S. L., 1985, Science 229:1202-1207, by Oi et al., 1986, BioTechniques 4:214, and by Queen et al. U.S. Pat. No. 5,585,089, U.S. Pat. No. 5,693,761 and U.S. Pat. No. 5,693,762, the contents of all of which are hereby incorporated by reference. Those methods include isolating, manipulating, and expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable regions from at least one of a heavy or light chain. Sources of such nucleic acid are well known to those skilled in the art and, for example, may be obtained from a hybridoma producing an antibody against a 46980 polypeptide or fragment thereof. The recombinant DNA encoding the humanized antibody, or fragment thereof, can then be cloned into an appropriate expression vector.

[1968] Humanized or CDR-grafted antibodies can be produced by CDR-grafting or CDR substitution, wherein one, two, or all CDR's of an immunoglobulin chain can be replaced. See e.g., U.S. Pat. No. 5,225,539; Jones et al. 1986 Nature 321:522-525; Verhoeyen et al. 1988 Science 239:1534; Beidler et al. 1988 J. Immunol. 141:4053-4060; Winter U.S. Pat. No. 5,225,539, the contents of all of which are hereby expressly incorporated by reference. Winter describes a CDR-grafting method which may be used to prepare the humanized antibodies of the present invention (UK Patent Application GB 2188638A, filed on Mar. 26, 1987; Winter U.S. Pat. No. 5,225,539), the contents of which are expressly incorporated by reference.

[1969] Also within the scope of the invention are humanized antibodies in which specific amino acids have been substituted, deleted or added. Preferred humanized antibodies have amino acid substitutions in the framework region, such as to improve binding to the antigen. For example, a humanized antibody will have framework residues identical to the donor framework residue or to another amino acid other than the recipient framework residue. To generate such antibodies, a selected, small number of acceptor framework residues of the humanized immunoglobulin chain can be replaced by the corresponding donor amino acids. Preferred locations of the substitutions include amino acid residues adjacent to the CDR, or which are capable of interacting with a CDR (see e.g., U.S. Pat. No. 5,585,089). Criteria for selecting amino acids from the donor are described in U.S. Pat. No. 5,585,089, e.g., columns 12-16 of U.S. Pat. No. 5,585,089, the contents of which are hereby incorporated by reference. Other techniques for humanizing antibodies are described in Padlan et al. EP 519596 A1, published on Dec. 23, 1992.

[1970] In preferred embodiments an antibody can be made by immunizing with purified 46980 antigen, or a fragment thereof, e.g., a fragment described herein, membrane associated antigen, tissue, e.g., crude tissue preparations, whole cells, preferably living cells, lysed cells, or cell fractions, e.g., membrane fractions.

[1971] A full-length 46980 protein or an antigenic peptide fragment of 46980 can be used as an immunogen or can be used to identify anti-46980 antibodies made with other immunogens, e.g., cells, membrane preparations, and the like. The antigenic peptide of 46980 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO:28 and encompass an epitope of 46980. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.

[1972] Fragments of 46980 which include residues about 162 to 174, about 422 to 454, or about 696 to 726 can be used to make antibodies against hydrophilic regions of the 46980 protein, e.g., used as immunogens or used to characterize the specificity of an antibody. Similarly, fragments of 46980 which include residues about 1 to 41, about 675 to 696, or about 521 to 535 can be used to make an antibody against a hydrophobic region of the 46980 protein; fragments of 46980 which include residues about 1 to 674, about 44 to 674, or about 25 to 590 of SEQ ID NO:28 can be used to make an antibody against an extracellular region of the 46980 protein; a fragment of 46980 which includes residues about 697 to 816 can be used to make an antibody against an intracellular region of the 46980 protein; a fragment of 46980 which includes residues about 25 to 590 of SEQ ID NO:28 can be used to make an antibody against the carboxylesterase region of the 46980 protein.
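
As a purely illustrative aid, the residue ranges recited above can be converted into peptide fragments with a short Python sketch. Residue numbers in the text are 1-based and inclusive, whereas Python slices are 0-based and end-exclusive, so the helper below performs that conversion. SEQ ID NO:28 is not reproduced here; the 816-residue placeholder string, the region labels, and the function name are hypothetical stand-ins, and the minimum length of 8 residues is the minimum stated above for an antigenic peptide.

```python
# Hypothetical 816-residue placeholder standing in for SEQ ID NO:28.
PLACEHOLDER_SEQUENCE = ("ACDEFGHIKLMNPQRSTVWY" * 41)[:816]

# Residue ranges as recited in the text (1-based, inclusive).
REGIONS = {
    "hydrophilic": [(162, 174), (422, 454), (696, 726)],
    "hydrophobic": [(1, 41), (675, 696), (521, 535)],
    "extracellular": [(1, 674), (44, 674), (25, 590)],
    "intracellular": [(697, 816)],
    "carboxylesterase": [(25, 590)],
}

MIN_ANTIGENIC_LENGTH = 8  # minimum antigenic peptide length stated in the text


def extract_fragment(sequence, start, end):
    """Return residues start..end of sequence, using 1-based inclusive numbering."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError(f"range {start}-{end} is outside the sequence")
    return sequence[start - 1:end]


for label, ranges in REGIONS.items():
    for start, end in ranges:
        peptide = extract_fragment(PLACEHOLDER_SEQUENCE, start, end)
        long_enough = len(peptide) >= MIN_ANTIGENIC_LENGTH
        print(label, f"{start}-{end}", len(peptide), long_enough)
```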

[1973] Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[1974] Antibodies which bind only native 46980 protein, only denatured or otherwise non-native 46980 protein, or which bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be identified by identifying antibodies which bind to native but not denatured 46980 protein.

[1975] Preferred epitopes encompassed by the antigenic peptide are regions of 46980 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 46980 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 46980 protein and are thus likely to constitute surface residues useful for targeting antibody production.

[1976] In a preferred embodiment the antibody can bind to the extracellular portion of the 46980 protein, e.g., it can bind to a whole cell which expresses the 46980 protein. In another embodiment, the antibody binds an intracellular portion of the 46980 protein. In preferred embodiments antibodies can bind one or more of purified antigen, membrane associated antigen, tissue, e.g., tissue sections, whole cells, preferably living cells, lysed cells, cell fractions, e.g., membrane fractions.

[1977] The anti-46980 antibody can be a single chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher, D. et al. (1999) Ann N Y Acad Sci 880:263-80; and Reiter, Y. (1996) Clin Cancer Res 2:245-52). The single chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 46980 protein.

[1978] In a preferred embodiment the antibody has effector function and can fix complement. In other embodiments the antibody does not recruit effector cells or fix complement.

[1979] In a preferred embodiment, the antibody has reduced or no ability to bind an Fc receptor. For example, it is an isotype or subtype, fragment or other mutant, which does not support binding to an Fc receptor, e.g., it has a mutagenized or deleted Fc receptor binding region.

[1980] In a preferred embodiment, an anti-46980 antibody alters (e.g., increases or decreases) a protein binding or recognition activity, e.g., neurexin binding or recognition, of a 46980 polypeptide. For example, the antibody can bind at or in proximity to the active site, e.g., to an epitope that includes a residue located from about 245 to 262 of SEQ ID NO:28.

[1981] The antibody can be coupled to a toxin, e.g., a polypeptide toxin, e.g., ricin or diphtheria toxin or an active fragment thereof, or to a radioactive nucleus or an imaging agent, e.g., a radioactive, enzymatic, or other imaging agent, e.g., an NMR contrast agent. Labels which produce detectable radioactive emissions or fluorescence are preferred.

[1982] An anti-46980 antibody (e.g., monoclonal antibody) can be used to isolate 46980 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-46980 antibody can be used to detect 46980 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-46980 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labelling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

[1983] The invention also includes a nucleic acid which encodes an anti-46980 antibody, e.g., an anti-46980 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g., CHO or lymphatic cells.

[1984] The invention also includes cell lines, e.g., hybridomas, which make an anti-46980 antibody, e.g., an antibody described herein, and methods of using said cells to make an anti-46980 antibody.

[1985] The invention also includes other 46980-ligands, e.g., a neurexin, preferably a β-neurexin. The neurexin can be modified by routine protein engineering techniques to obtain a soluble neurexin domain which binds to a 46980 polypeptide.

[1986] 46980 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[1987] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[1988] A vector can include a 46980 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 46980 proteins, mutant forms of 46980 proteins, fusion proteins, and the like).

[1989] The recombinant expression vectors of the invention can be designed for expression of 46980 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[1990] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[1991] Purified fusion proteins can be used in 46980 activity assays, (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 46980 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells which are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six weeks).

[1992] One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
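
The codon-alteration strategy described above can be sketched in Python as follows: each codon of a coding sequence is translated with the standard genetic code and then rewritten as a single preferred synonymous codon, so that the encoded amino acid sequence is unchanged. The preferred-codon table below is an illustrative assumption listing codons commonly regarded as frequent in E. coli; it is not taken from Wada et al., and the function names and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TO_AA = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Illustrative preferred codons (an assumption for this sketch, not measured data).
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TAA",
}


def translate(coding_dna):
    """Translate a coding DNA sequence (length divisible by 3) into one-letter amino acids."""
    dna = coding_dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode_with_preferred_codons(coding_dna):
    """Replace every codon with the preferred synonymous codon for the same amino acid."""
    return "".join(PREFERRED_CODON[aa] for aa in translate(coding_dna))


original = "ATGCTAAGATAG"  # hypothetical example encoding Met-Leu-Arg-stop
recoded = recode_with_preferred_codons(original)
print(recoded)
print(translate(original) == translate(recoded))
```

Using one codon per amino acid is the simplest possible scheme; practical designs often also weigh factors such as repeats and secondary structure, which this sketch does not attempt.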

[1993] The 46980 expression vector can be a yeast expression vector, a vector for expression in insect cells, e.g., a baculovirus expression vector or a vector suitable for expression in mammalian cells.

[1994] When used in mammalian cells, the expression vector's control functions can be provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[1995] In another embodiment, the promoter is an inducible promoter, e.g., a promoter regulated by a steroid hormone, by a polypeptide hormone (e.g., by means of a signal transduction pathway), or by a heterologous polypeptide (e.g., the tetracycline-inducible systems, “Tet-On” and “Tet-Off”; see, e.g., Clontech Inc., CA, Gossen and Bujard (1992) Proc. Natl. Acad. Sci. USA 89:5547, and Paillard (1998) Human Gene Therapy 9:983).

[1996] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3:537-546).

[1997] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus.
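
To illustrate what cloning in an antisense orientation means at the sequence level, the following Python sketch computes the reverse complement of a sense-strand DNA fragment. Placing the reverse complement downstream of a promoter, in place of the sense sequence, yields a transcript complementary to the sense mRNA. The function name and the example fragment are hypothetical, and the sketch handles only the unambiguous bases A, C, G, and T.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence made of A, C, G, and T."""
    if set(dna.upper()) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G, and T")
    return dna.translate(_COMPLEMENT)[::-1]


sense_fragment = "ATGGCC"  # hypothetical sense-strand fragment, written 5' to 3'
antisense_insert = reverse_complement(sense_fragment)
print(antisense_insert)
```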

[1998] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., a 46980 nucleic acid molecule within a recombinant expression vector or a 46980 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[1999] A host cell can be any prokaryotic or eukaryotic cell. For example, a 46980 protein can be expressed in bacterial cells (such as E. coli), insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells (African green monkey kidney cells CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182)). Other suitable host cells are known to those skilled in the art.

[2000] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[2001] A host cell of the invention can be used to produce (i.e., express) a 46980 protein. Accordingly, the invention further provides methods for producing a 46980 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a 46980 protein has been introduced) in a suitable medium such that a 46980 protein is produced. In another embodiment, the method further includes isolating a 46980 protein from the medium or the host cell.

[2002] In another aspect, the invention features a cell or purified preparation of cells which include a 46980 transgene, or which otherwise misexpress 46980. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include a 46980 transgene, e.g., a heterologous form of a 46980, e.g., a gene derived from humans (in the case of a non-human cell). The 46980 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene that mis-expresses an endogenous 46980, e.g., a gene the expression of which is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders that are related to mutated or mis-expressed 46980 alleles or for use in drug screening.

[2003] In another aspect, the invention features a human cell, e.g., a neuronal stem cell, transformed with nucleic acid which encodes a subject 46980 polypeptide.

[2004] Also provided are cells, preferably human cells, e.g., human neuronal cells or fibroblast cells, in which an endogenous 46980 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 46980 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 46980 gene. For example, an endogenous 46980 gene which is “transcriptionally silent,” e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques such as targeted homologous recombination can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published on May 16, 1991.

[2005] In a preferred embodiment, recombinant cells described herein can be used for replacement therapy in a subject. For example, a nucleic acid encoding a 46980 polypeptide, e.g., a soluble extracellular domain thereof, operably linked to an inducible promoter (e.g., a steroid hormone receptor-regulated promoter) is introduced into a human or nonhuman, e.g., mammalian, e.g., porcine recombinant cell. The cell is cultivated and encapsulated in a biocompatible material, such as poly-lysine alginate, and subsequently implanted into the subject. See, e.g., Lanza (1996) Nat. Biotechnol. 14:1107; Joki et al. (2001) Nat. Biotechnol. 19:35; and U.S. Pat. No. 5,876,742. Production of 46980 polypeptide can be regulated in the subject by administering an agent (e.g., a steroid hormone) to the subject. In another preferred embodiment, the implanted recombinant cells express and secrete an antibody specific for a 46980 polypeptide. The antibody can be any antibody or any antibody derivative described herein.

[2006] 46980 Transgenic Animals

[2007] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of a 46980 protein and for identifying and/or evaluating modulators of 46980 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 46980 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[2008] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of a 46980 protein to particular cells. A transgenic founder animal can be identified based upon the presence of a 46980 transgene in its genome and/or expression of 46980 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a 46980 protein can further be bred to other transgenic animals carrying other transgenes.

[2009] 46980 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments the nucleic acid is placed under the control of a tissue specific promoter, e.g., a milk or egg specific promoter, and the protein or polypeptide is recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[2010] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[2011] Uses of 46980

[2012] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: a) screening assays; b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); c) isolation of neuronal cells; d) detection of 46980-expressing cells, e.g., neural cells, testicular cells, and cancerous derivatives of such cells; and e) methods of treatment (e.g., therapeutic and prophylactic).

[2013] The isolated nucleic acid molecules of the invention can be used, for example, to express a 46980 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect a 46980 mRNA (e.g., in a biological sample) or a genetic alteration in a 46980 gene, and to modulate 46980 activity, as described further below. The 46980 proteins can be used to treat disorders characterized by insufficient or excessive production of a 46980 substrate or production of 46980 inhibitors. In addition, the 46980 proteins can be used to screen for naturally occurring 46980 substrates, to screen for drugs or compounds which modulate 46980 activity, as well as to treat disorders characterized by insufficient or excessive production of 46980 protein or production of 46980 protein forms which have decreased, aberrant or unwanted activity compared to 46980 wild type protein (e.g., a neuronal disorder, e.g., a pain-related, neuronal connectivity-related, or neural degenerative disorder). Moreover, the anti-46980 antibodies of the invention can be used to detect and isolate 46980 proteins, regulate the bioavailability of 46980 proteins, and modulate 46980 activity. Further, anti-46980 antibodies can be used to purify 46980-expressing cells, e.g., for diagnostics. In another application, solid supports having stably attached soluble 46980 extracellular domains can be used to purify a neurexin or neurexin-expressing cell.

[2014] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 46980 polypeptide is provided. The method includes: contacting the compound with the subject 46980 polypeptide; and evaluating ability of the compound to interact with, e.g., to bind or form a complex with the subject 46980 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with subject 46980 polypeptide. It can also be used to find natural or synthetic inhibitors of subject 46980 polypeptide. Screening methods are discussed in more detail below.

[2015] 46980 Screening Assays

[2016] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to 46980 proteins, have a stimulatory or inhibitory effect on, for example, 46980 expression or 46980 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 46980 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 46980 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[2017] In one embodiment, the invention provides assays for screening candidate or test compounds which are substrates of a 46980 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate an activity of a 46980 protein or polypeptide or a biologically active portion thereof.

[2018] In one embodiment, an activity of a 46980 protein can be assayed by measuring binding to neurexin. For example, Ichtchenko et al., supra, provide a biochemical assay for binding of a neuroligin to an immobilized neurexin-IgG fusion protein. The biochemical interaction can be calcium dependent and can require lack of an insert in splice site four of the neurexin.

[2019] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial-library methods known in the art including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann, R. N. et al. (1994) J. Med. Chem. 37:2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).

[2020] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. USA 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and Gallop et al. (1994) J. Med. Chem. 37:1233.

[2021] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner, U.S. Pat. No. 5,223,409), spores (Ladner, U.S. Pat. No. 5,223,409), plasmids (Cull et al. (1992) Proc. Natl. Acad. Sci. USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. USA 87:6378-6382; Felici (1991) J. Mol. Biol. 222:301-310; Ladner, supra).

[2022] In one embodiment, an assay is a cell-based assay in which a cell which expresses a 46980 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 46980 activity is determined. Determining the ability of the test compound to modulate 46980 activity can be accomplished by monitoring, for example, a protein binding or recognition function, e.g., neurexin binding or recognition. The cell, for example, can be of mammalian origin, e.g., human.

[2023] The ability of the test compound to modulate 46980 binding to a compound, e.g., a 46980 substrate, or to bind to 46980 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 46980 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 46980 could be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 46980 binding to a 46980 substrate in a complex. For example, compounds (e.g., 46980 substrates) can be labeled with ¹²⁵I, ³⁵S, ¹⁴C, or ³H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[2024] The ability of a compound (e.g., a 46980 substrate) to interact with 46980 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 46980 without the labeling of either the compound or the 46980. McConnell, H. M. et al. (1992) Science 257:1906-1912. As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 46980.

[2025] In yet another embodiment, a cell-free assay is provided in which a 46980 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 46980 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 46980 proteins to be used in assays of the present invention include fragments which participate in interactions with non-46980 molecules, e.g., fragments with high surface probability scores.

[2026] Soluble and/or membrane-bound forms of isolated proteins (e.g., 46980 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, Isotridecylpoly(ethylene glycol ether)ₙ, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[2027] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[2028] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternatively, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
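
By way of illustration only, the distance dependence of energy transfer can be sketched with the conventional Förster relation, E = 1 / (1 + (r/R0)^6). This relation is not recited in the specification and is assumed here; the function name and the example Förster radius are hypothetical.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Energy-transfer efficiency between donor and acceptor labels.

    Assumes the standard Foerster relation E = 1 / (1 + (r / R0)**6),
    where R0 is the distance at which transfer is 50% efficient.
    This relation is an assumption, not taken from the specification.
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Foerster radius > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


if __name__ == "__main__":
    r0 = 5.0  # hypothetical Foerster radius in nm for an illustrative dye pair
    for r in (2.5, 5.0, 10.0):
        print(f"r = {r:4.1f} nm -> E = {fret_efficiency(r, r0):.3f}")
```

Under this assumption, efficiency falls steeply once the labels are separated by more than the Förster radius, which is why acceptor emission is expected to be highest when the labeled molecules are bound.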

[2029] In another embodiment, determining the ability of the 46980 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal which can be used as an indication of real-time reactions between biological molecules.

[2030] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound (which is not anchored) can be labeled, either directly or indirectly, with detectable labels discussed herein.

[2031] It may be desirable to immobilize either 46980, an anti-46980 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a 46980 protein, or interaction of a 46980 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/46980 fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 46980 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and complex formation is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 46980 binding or activity determined using standard techniques.

[2032] Other techniques for immobilizing either a 46980 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 46980 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical).

[2033] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[2034] In one embodiment, this assay is performed utilizing antibodies reactive with 46980 protein or target molecules but which do not interfere with binding of the 46980 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 46980 protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 46980 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 46980 protein or target molecule.

[2035] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components, by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., (1993) Trends Biochem Sci 18:284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York.); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. (1999) Current Protocols in Molecular Biology, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., (1998) J Mol Recognit 11: 141-8; Hage, D. S., and Tweed, S. A. (1997) J Chromatogr B Biomed Sci Appl. 699:499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[2036] In a preferred embodiment, the assay includes contacting the 46980 protein or biologically active portion thereof with a known compound which binds 46980 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a 46980 protein, wherein determining the ability of the test compound to interact with a 46980 protein includes determining the ability of the test compound to preferentially bind to 46980 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[2037] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 46980 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of a 46980 protein through modulation of the activity of a downstream effector of a 46980 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[2038] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), a reaction mixture containing the target gene product and the binding partner is prepared, under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.
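
The comparison of test and control reaction mixtures can be sketched numerically as follows. This is a minimal illustration that assumes complex formation is read out as a background-corrected signal (e.g., counts from a labeled binding partner); the function name, the background parameter, and the example values are hypothetical and are not part of the specification.

```python
def percent_inhibition(control_signal: float,
                       test_signal: float,
                       background: float = 0.0) -> float:
    """Percent reduction in complex signal caused by a test compound.

    control_signal: signal from the reaction incubated without test compound
    test_signal:    signal from the reaction containing the test compound
    background:     signal from a blank with no complex (assumed readout)
    """
    control = control_signal - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    test = test_signal - background
    return 100.0 * (1.0 - test / control)


if __name__ == "__main__":
    # Hypothetical readings: control well, well with compound, blank well.
    print(percent_inhibition(1100.0, 350.0, background=100.0))
```

The same calculation could be run separately on reaction mixtures containing normal and mutant target gene product to look for compounds that preferentially disrupt the mutant interaction.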

[2039] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[2040] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner, is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled, either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[2041] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[2042] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[2043] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared such that either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[2044] In yet another aspect, the 46980 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 46980 (“46980-binding proteins” or “46980-bp”) and are involved in 46980 activity. Such 46980-bps can be activators or inhibitors of signals by the 46980 proteins or 46980 targets as, for example, downstream elements of a 46980-mediated signaling pathway.

[2045] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a 46980 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively, the 46980 protein can be fused to the activator domain.) If the “bait” and the “prey” proteins are able to interact, in vivo, forming a 46980-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., lacZ) which is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene which encodes the protein which interacts with the 46980 protein.

[2046] In another embodiment, modulators of 46980 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 46980 mRNA or protein evaluated relative to the level of expression of 46980 mRNA or protein in the absence of the candidate compound. When expression of 46980 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 46980 mRNA or protein expression. Alternatively, when expression of 46980 mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 46980 mRNA or protein expression. The level of 46980 mRNA or protein expression can be determined by methods described herein for detecting 46980 mRNA or protein.
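
The decision rule described above can be sketched as follows. This is an illustrative example only: the specification does not name a statistical test, so an exact two-sided permutation test on the difference in means is assumed here, and the function names, the 0.05 threshold, and the replicate values are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Exact two-sided permutation p-value for the difference in means."""
    pooled = list(treated) + list(untreated)
    n_treated = len(treated)
    observed = abs(mean(treated) - mean(untreated))
    indices = range(len(pooled))
    total = 0
    extreme = 0
    # Enumerate every way of relabeling the pooled measurements.
    for chosen in combinations(indices, n_treated):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def classify_modulator(treated, untreated, alpha=0.05):
    """Label a candidate compound from expression levels with/without it."""
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"


if __name__ == "__main__":
    # Hypothetical relative mRNA levels from four replicate wells each.
    with_compound = [9.8, 10.1, 10.4, 9.9]
    without_compound = [5.0, 5.2, 4.9, 5.1]
    print(classify_modulator(with_compound, without_compound))
```

With an exact permutation test, at least four replicates per condition are needed before a two-sided p-value below 0.05 is attainable; a different test could be substituted without changing the classification logic.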

[2047] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a 46980 protein can be confirmed in vivo, e.g., in an animal such as an animal model for a neuronal disorder, e.g., a pain-related, neuronal connectivity-related, or neural degenerative disorder.

[2048] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a 46980 modulating agent, an antisense 46980 nucleic acid molecule, a 46980-specific antibody, or a 46980-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[2049] 46980 Detection Assays

[2050] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome, e.g., to locate gene regions associated with genetic disease or to associate 46980 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[2051] 46980 Chromosome Mapping

[2052] The 46980 nucleotide sequences or portions thereof can be used to map the location of the 46980 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 46980 sequences with genes associated with disease.

[2053] Briefly, 46980 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 46980 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 46980 sequences will yield an amplified fragment.

[2054] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes, and a full set of mouse chromosomes, can allow easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220:919-924).
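
As a minimal sketch of how results from such a panel might be interpreted, the following assumes that the human chromosome content of each hybrid line is known and that each line has been scored for the presence or absence of the amplified fragment. The function name, the data layout, and the example panel are hypothetical.

```python
def assign_chromosome(panel, pcr_positive):
    """Return chromosomes whose presence matches the PCR result in every hybrid.

    panel:        dict mapping hybrid name -> set of human chromosomes retained
    pcr_positive: dict mapping hybrid name -> True if a fragment was amplified
    """
    chromosomes = set().union(*panel.values())
    concordant = []
    for chrom in sorted(chromosomes):
        # A candidate chromosome must be present in every positive hybrid
        # and absent from every negative hybrid.
        if all((chrom in retained) == pcr_positive[hybrid]
               for hybrid, retained in panel.items()):
            concordant.append(chrom)
    return concordant


if __name__ == "__main__":
    # Hypothetical four-line panel and PCR scores.
    panel = {
        "hybrid_A": {"1", "7"},
        "hybrid_B": {"7", "X"},
        "hybrid_C": {"1", "X"},
        "hybrid_D": {"12"},
    }
    scores = {"hybrid_A": True, "hybrid_B": True,
              "hybrid_C": False, "hybrid_D": False}
    print(assign_chromosome(panel, scores))
```

If more than one chromosome is returned, the panel does not discriminate between them and additional hybrid lines would be needed.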

[2055] Other mapping strategies, e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries, can be used to map 46980 to a chromosomal location.

[2056] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases, will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques ((1988) Pergamon Press, New York).

[2057] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[2058] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325:783-787.

[2059] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 46980 gene, can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

[2060] 46980 Tissue Typing

[2061] 46980 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).
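
The digestion step can be sketched in silico as follows. This is an illustration under stated assumptions: a linear DNA molecule, the EcoRI recognition site GAATTC cut after the first base, and short made-up allele sequences. A real Southern blot would show only the fragments that hybridize to the probe, which this sketch does not model.

```python
def digest(sequence: str, site: str = "GAATTC", cut_offset: int = 1):
    """Fragment lengths from cutting linear DNA at every occurrence of `site`.

    cut_offset is the number of bases into the site at which the strand is
    cut (1 for EcoRI, which cuts G^AATTC).
    """
    sequence = sequence.upper()
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cut_positions + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


def banding_pattern(sequence: str, **kwargs):
    """Fragment lengths sorted largest first, as they would order on a gel."""
    return sorted(digest(sequence, **kwargs), reverse=True)


if __name__ == "__main__":
    # Hypothetical alleles: allele_b has lost the second EcoRI site (A -> G).
    allele_a = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TTTT"
    allele_b = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TTTT"
    print(banding_pattern(allele_a))
    print(banding_pattern(allele_b))
```

A single base change that creates or destroys a recognition site changes the fragment lengths, and it is this difference in banding pattern that distinguishes individuals.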

[2062] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 46980 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.
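
A minimal sketch of deriving such a primer pair is given below. It assumes the template is supplied as the sense strand written 5′ to 3′, takes the forward primer directly from the 5′ end, and takes the reverse primer as the reverse complement of the 3′ end. The function names and the 15 to 25 base length check (borrowed from the primer length preferred for chromosome mapping) are illustrative assumptions, and no melting-temperature or specificity checks are performed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def end_primers(template: str, length: int = 20):
    """Forward and reverse primers taken from the two ends of `template`.

    The 15-25 base window is an assumed default, not a requirement.
    """
    template = template.upper()
    if not 15 <= length <= 25:
        raise ValueError("primer length expected in the range 15-25 bases")
    if len(template) < 2 * length:
        raise ValueError("template too short for non-overlapping primers")
    forward = template[:length]
    reverse = reverse_complement(template[-length:])
    return forward, reverse


if __name__ == "__main__":
    # Hypothetical 46-base template, not a 46980 sequence.
    template = "ATGGCCAAGCTTGACCTG" + "AAAAAAAAAA" + "GGATCCTAGGCTAACGTT"
    fwd, rev = end_primers(template, length=15)
    print("forward:", fwd)
    print("reverse:", rev)
```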

[2063] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences of SEQ ID NO:27 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO:29, are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[2064] If a panel of reagents from 46980 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[2065] Use of Partial 46980 Sequences in Forensic Biology

[2066] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[2067] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e. another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO:27 (e.g., fragments derived from the noncoding regions of SEQ ID NO:27 having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[2068] The 46980 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 46980 probes can be used to identify tissue by species and/or by organ type.

[2069] In a similar fashion, these reagents, e.g., 46980 primers or probes, can be used to screen tissue culture for contamination (i.e., screen for the presence of a mixture of different types of cells in a culture).

[2070] Predictive Medicine of 46980

[2071] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[2072] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene which encodes 46980.

[2073] Such disorders include, e.g., a disorder associated with the misexpression of the 46980 gene, or a disorder of the neurological system.

[2074] The method includes one or more of the following:

[2075] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 46980 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[2076] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 46980 gene;

[2077] detecting, in a tissue of the subject, the misexpression of the 46980 gene, at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[2078] detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of a 46980 polypeptide.

[2079] In preferred embodiments the method includes: ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 46980 gene; an insertion of one or more nucleotides into the gene, a point mutation, e.g., a substitution of one or more nucleotides of the gene, a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[2080] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from SEQ ID NO:27, or naturally occurring mutants thereof or 5′ or 3′ flanking sequences naturally associated with the 46980 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[2081] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 46980 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 46980.

[2082] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[2083] In preferred embodiments the method includes determining the structure of a 46980 gene, an abnormal structure being indicative of risk for the disorder.

[2084] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 46980 protein or with a nucleic acid which hybridizes specifically with the gene. These and other embodiments are discussed below.

[2085] Diagnostic and Prognostic Assays of 46980

[2086] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 46980 molecules and for identifying variations and mutations in the sequence of 46980 molecules.

[2087] Expression Monitoring and Profiling:

[2088] The presence, level, or absence of 46980 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 46980 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 46980 protein such that the presence of 46980 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the 46980 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 46980 genes; measuring the amount of protein encoded by the 46980 genes; or measuring the activity of the protein encoded by the 46980 genes.

[2089] The level of mRNA corresponding to the 46980 gene in a cell can be determined both by in situ and by in vitro formats.

[2090] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 46980 nucleic acid, such as the nucleic acid of SEQ ID NO:27, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 46980 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[2091] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 46980 genes.

[2092] The level of mRNA in a sample that is encoded by 46980 can be evaluated with nucleic acid amplification, e.g., by rtPCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
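The primer geometry stated above (primers of about 10 to 30 nucleotides flanking a region of about 50 to 200 nucleotides) can be checked in software before a primer pair is ordered. The following is a minimal sketch only; the function names are hypothetical, the sequences used with it would be supplied by the user, and exact string matching is an assumed simplification that ignores melting temperature, mismatches, and secondary structure.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def flanked_region_length(template: str, forward: str, reverse: str):
    """Return the number of bases between the two primer sites, or None.

    The forward primer is matched against the plus strand. The reverse
    primer anneals to the plus strand, so its reverse complement is searched
    for on the plus strand downstream of the forward primer.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    inner_start = start + len(forward)
    end = template.find(reverse_complement(reverse), inner_start)
    if end == -1:
        return None
    return end - inner_start


def primers_fit_stated_ranges(template: str, forward: str, reverse: str) -> bool:
    """Check primer lengths (10 to 30 nt) and flanked region (50 to 200 nt)."""
    region = flanked_region_length(template, forward, reverse)
    lengths_ok = all(10 <= len(p) <= 30 for p in (forward, reverse))
    return lengths_ok and region is not None and 50 <= region <= 200
```

A pair that passes this check merely satisfies the stated length ranges; whether it amplifies under given reaction conditions must still be determined empirically.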

[2093] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 46980 gene being analyzed.

[2094] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 46980 mRNA, or genomic DNA, and comparing the presence of 46980 mRNA or genomic DNA in the control sample with the presence of 46980 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 46980 transcript levels.

[2095] A variety of methods can be used to determine the level of protein encoded by 46980. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody, with a sample, to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)₂) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[2096] The detection methods can be used to detect 46980 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 46980 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 46980 protein include introducing into a subject a labeled anti-46980 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-46980 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[2097] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 46980 protein, and comparing the presence of 46980 protein in the control sample with the presence of 46980 protein in the test sample.

[2098] The invention also includes kits for detecting the presence of 46980 in a biological sample. For example, the kit can include a compound or agent capable of detecting 46980 protein or mRNA in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 46980 protein or nucleic acid.

[2099] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[2100] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples which can be assayed and compared to the test sample contained. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[2101] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with misexpressed or aberrant or unwanted 46980 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as a neuronal disorder, e.g., a pain-related, neuronal connectivity-related, or neural degenerative disorder, or deregulated cell proliferation.

[2102] In one embodiment, a disease or disorder associated with aberrant or unwanted 46980 expression or activity is identified. A test sample is obtained from a subject and 46980 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 46980 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 46980 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[2103] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 46980 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for a neuronal disorder, e.g., a pain-related, neuronal connectivity-related, or neural degenerative disorder.

[2104] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 46980 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 46980 (e.g., other genes associated with a 46980-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
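One way such data records might be laid out as a relational table is sketched below. This is an illustration under stated assumptions, not a schema taken from the specification: it uses Python's built-in sqlite3 module as a stand-in for the Oracle or Sybase environments mentioned above, and the table name, column names, and helper functions are all hypothetical.

```python
import sqlite3


def build_expression_db() -> sqlite3.Connection:
    """Create an in-memory database holding one row per (sample, gene) value."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE expression_record (
            sample_id TEXT NOT NULL,   -- identifier of the sample
            gene      TEXT NOT NULL,   -- e.g. '46980' or another gene on an array
            level     REAL NOT NULL,   -- value representing the expression level
            subject   TEXT,            -- subject from which the sample was derived
            diagnosis TEXT,            -- optional descriptor
            treatment TEXT,            -- optional descriptor
            PRIMARY KEY (sample_id, gene)
        )
        """
    )
    return conn


def add_record(conn, sample_id, gene, level,
               subject=None, diagnosis=None, treatment=None):
    """Insert one data record; descriptors that are unknown stay NULL."""
    conn.execute(
        "INSERT INTO expression_record VALUES (?, ?, ?, ?, ?, ?)",
        (sample_id, gene, level, subject, diagnosis, treatment),
    )


def levels_for_sample(conn, sample_id):
    """Return a {gene: level} mapping for one sample."""
    rows = conn.execute(
        "SELECT gene, level FROM expression_record "
        "WHERE sample_id = ? ORDER BY gene",
        (sample_id,),
    ).fetchall()
    return dict(rows)
```

Storing one row per sample and gene, rather than one wide row per sample, is a design choice of this sketch; it lets values for genes other than 46980 be added without altering the table.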

[2105] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 46980 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used to diagnose a neuronal disorder, e.g., a pain-related, neuronal connectivity-related, or neural degenerative disorder, in a subject wherein an alteration in 46980 expression is an indication that the subject has or is disposed to having a neuronal disorder, e.g., a pain-related, neuronal connectivity-related, or neural degenerative disorder. The method can be used to monitor a treatment for a neuronal disorder, e.g., a pain-related, neuronal connectivity-related, or neural degenerative disorder, in a subject. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[2106] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 46980 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for a desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from an uncontacted cell.

[2107] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; and b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 46980 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profiles is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
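The distance metric described above can be sketched in a few lines. The sketch assumes that each profile is a list of numeric expression values in the same gene order; the function names and the dictionary of named reference profiles are hypothetical and are introduced only for illustration.

```python
import math


def distance(profile_a, profile_b):
    """Length of the difference vector between two profiles of equal dimension."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same number of dimensions")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


def most_similar_reference(subject, references):
    """Return (name, distance) for the reference profile closest to the subject.

    `references` maps a descriptive name to a reference profile.
    """
    name = min(references, key=lambda n: distance(subject, references[n]))
    return name, distance(subject, references[name])
```

The length of the difference vector is only one possible metric; because it is sensitive to the scale of each dimension, values would ordinarily be normalized before comparison.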

[2108] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[2109] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile, and the reference expression profiles each include a value representing the level of 46980 expression.

[2110] 46980 Arrays and Uses Thereof

[2111] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to a 46980 molecule (e.g., a 46980 nucleic acid or a 46980 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm², and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass-spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to the addresses of the plurality can be disposed on the array.

[2112] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to a 46980 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 46980. Each address of the subset can include a capture probe that hybridizes to a different region of a 46980 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for a 46980 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 46980 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 46980 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).

[2113] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[2114] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to a 46980 polypeptide or fragment thereof. The polypeptide can be a naturally-occurring interaction partner of 46980 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-46980 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[2115] In another aspect, the invention features a method of analyzing the expression of 46980. The method includes providing an array as described above; contacting the array with a sample; and detecting binding of a 46980-molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[2116] In another embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array, particularly the expression of 46980. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 46980. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
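As a simple illustration of identifying genes that may be co-regulated with 46980, the sketch below ranks genes by the Pearson correlation of their expression values with those of 46980 across a set of samples. Correlation ranking is used here as an assumed, simpler stand-in for the hierarchical, k-means, or Bayesian clustering named above; the function names, the gene labels, and the 0.9 threshold are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def co_regulated(expression, target="46980", threshold=0.9):
    """Return (gene, r) pairs whose profile correlates with the target's.

    `expression` maps each gene to its expression values across the same
    ordered set of samples. Results are sorted by descending correlation.
    """
    target_profile = expression[target]
    hits = []
    for gene, profile in expression.items():
        if gene == target:
            continue
        r = pearson(target_profile, profile)
        if r >= threshold:
            hits.append((gene, r))
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

A high correlation across a small number of samples is weak evidence of co-regulation, which is consistent with the statement above that a sufficient number of diverse samples should be analyzed.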

[2117] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 46980 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[2118] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of other than the target gene can be ascertained and counteracted.
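The comparison between cells contacted with an agent and like cells not contacted with the agent can be sketched as a per-gene log ratio. The sketch below is illustrative only: the function name, the use of a log2 ratio, and the default cutoff of a two-fold change are assumptions and are not taken from the specification.

```python
import math


def off_target_changes(treated, untreated, target_gene, min_abs_log2_fc=1.0):
    """Return {gene: log2 fold change} for non-target genes that changed.

    `treated` and `untreated` map gene names to expression levels. Genes
    absent from either profile, or with a non-positive level (for which the
    log ratio is undefined), are skipped.
    """
    changes = {}
    for gene in treated.keys() & untreated.keys():
        if gene == target_gene:
            continue
        if treated[gene] <= 0 or untreated[gene] <= 0:
            continue
        log2_fc = math.log2(treated[gene] / untreated[gene])
        if abs(log2_fc) >= min_abs_log2_fc:
            changes[gene] = log2_fc
    return changes
```

Genes reported by such a comparison are candidates for further study of an undesirable effect; the sketch makes no statistical correction for replicate variability.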

[2119] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of a 46980-associated disease or disorder; and processes, such as a cellular transformation associated with a 46980-associated disease or disorder. The method can also evaluate the treatment and/or progression of a 46980-associated disease or disorder.

[2120] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 46980) that could serve as a molecular target for diagnosis or therapeutic intervention.

[2121] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon a 46980 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to a 46980 polypeptide or fragment thereof. For example, multiple variants of a 46980 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the addresses of the plurality can be disposed on the array.

[2122] The polypeptide array can be used to detect a 46980 binding compound, e.g., an antibody in a sample from a subject with specificity for a 46980 polypeptide or the presence of a 46980-binding protein or ligand.

[2123] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 46980 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[2124] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 46980 or from a cell or subject in which a 46980 mediated response has been elicited, e.g., by contact of the cell with 46980 nucleic acid or protein, or administration to the cell or subject of 46980 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 46980 (or does not express as highly as in the case of the 46980 positive plurality of capture probes) or from a cell or subject in which a 46980 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which are preferably other than a 46980 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[2125] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, contacting the array with a first sample from a cell or subject which expresses or mis-expresses 46980 or from a cell or subject in which a 46980-mediated response has been elicited, e.g., by contact of the cell with 46980 nucleic acid or protein, or administration to the cell or subject of 46980 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a second sample from a cell or subject which does not express 46980 (or does not express as highly as in the case of the 46980 positive plurality of capture probes) or from a cell or subject in which a 46980 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used, the plurality of addresses with capture probes should be present on both arrays.

[2126] In another aspect, the invention features a method of analyzing 46980, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 46980 nucleic acid or amino acid sequence; and comparing the 46980 sequence with one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database; to thereby analyze 46980.

[2127] Detection of 46980 Variations or Mutations

[2128] The methods of the invention can also be used to detect genetic alterations in a 46980 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by misregulation in 46980 protein activity or nucleic acid expression, such as a neuronal disorder, e.g., a pain-related, neuronal connectivity-related, or neural degenerative disorder. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a 46980-protein, or the mis-expression of the 46980 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of 1) a deletion of one or more nucleotides from a 46980 gene; 2) an addition of one or more nucleotides to a 46980 gene; 3) a substitution of one or more nucleotides of a 46980 gene; 4) a chromosomal rearrangement of a 46980 gene; 5) an alteration in the level of a messenger RNA transcript of a 46980 gene; 6) aberrant modification of a 46980 gene, such as of the methylation pattern of the genomic DNA; 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of a 46980 gene; 8) a non-wild type level of a 46980-protein; 9) allelic loss of a 46980 gene; and 10) inappropriate post-translational modification of a 46980-protein.

[2129] An alteration can be detected with a probe/primer in a polymerase chain reaction, such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 46980 gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a 46980 gene under conditions such that hybridization and amplification of the 46980 gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[2130] In another embodiment, mutations in a 46980 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA is isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment length sizes are determined, e.g., by gel electrophoresis, and compared. Differences in fragment length sizes between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
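
The comparison of restriction fragment lengths can be illustrated, under simplifying assumptions, by the following sketch. It models a linear, single-stranded representation of the DNA, a single recognition site (GAATTC, cut after the first base, as for EcoRI), and hypothetical control and sample sequences; none of these sequences is a 46980 sequence.

```python
def digest(sequence, site, cut_offset):
    """Return sorted fragment lengths after cutting a linear sequence.

    site: recognition sequence; cut_offset: cut position within the site.
    """
    cuts = []
    pos = sequence.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = sequence.find(site, pos + 1)
    bounds = [0] + cuts + [len(sequence)]
    return sorted(end - start for start, end in zip(bounds, bounds[1:]))

# Hypothetical control DNA with two sites; the sample carries a point
# change (C to T) that destroys the second site.
control = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TT"
sample = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTT" + "TT"

control_fragments = digest(control, "GAATTC", 1)
sample_fragments = digest(sample, "GAATTC", 1)
mutation_indicated = control_fragments != sample_fragments
```

Here the loss of one site merges two control fragments into a single longer sample fragment, which is the kind of difference that gel electrophoresis would reveal.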

[2131] In other embodiments, genetic mutations in 46980 can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the other. A different probe is located at each address of the plurality. A probe can be complementary to a region of a 46980 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of a 46980 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 46980 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.

[2132] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 46980 gene and detect mutations by comparing the sequence of the sample 46980 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.

[2133] Other methods for detecting mutations in the 46980 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl. Acad Sci USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295).

[2134] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 46980 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[2135] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 46980 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc. Natl. Acad. Sci. USA 86:2766; see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control 46980 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence, and the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).

[2136] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[2137] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad. Sci USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[2138] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.

[2139] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 50%, 60%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to a 46980 nucleic acid.
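
For illustration, the percent complementarity between an oligonucleotide and a region of a target nucleic acid can be computed as sketched below. The sketch assumes an ungapped, equal-length comparison of DNA sequences written 5′ to 3′, and the sequences shown are hypothetical rather than portions of SEQ ID NO:27.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def percent_complementary(oligo, target_region):
    """Percent of oligo positions that pair with target_region (ungapped)."""
    if len(oligo) != len(target_region) or not oligo:
        raise ValueError("sequences must be non-empty and of equal length")
    partner = reverse_complement(target_region)  # perfect-match oligo
    matches = sum(a == b for a, b in zip(oligo, partner))
    return 100.0 * matches / len(oligo)

# Hypothetical 10-nucleotide target region and two candidate oligos.
target = "ACGTACGTAC"
perfect = percent_complementary("GTACGTACGT", target)
one_mismatch = percent_complementary("GTACGTACGA", target)
```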

[2140] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO:27 or the complement of SEQ ID NO:27. Different locations can be different but overlapping, or non-overlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[2141] The set can be useful, e.g., for identifying SNPs, or identifying specific alleles of 46980. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a biallelic or polymorphic locus.

[2142] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the T_m of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
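
A set of four oligonucleotides differing only at the interrogation position can be sketched as follows. The template sequence, the position, and the exact-match allele call are hypothetical simplifications; actual hybridization depends on conditions that this sketch does not model.

```python
def interrogation_set(template, position):
    """Return four oligos, keyed by the base placed at the interrogation position."""
    if not 0 <= position < len(template):
        raise ValueError("position outside template")
    return {base: template[:position] + base + template[position + 1:]
            for base in "ACGT"}

def call_allele(observed, oligos):
    """Return the base whose oligo equals the observed sequence, or None."""
    for base, oligo in oligos.items():
        if oligo == observed:
            return base
    return None

# Hypothetical 6-nucleotide template with the interrogation position at index 3.
oligos = interrogation_set("GGATCC", 3)
called = call_allele("GGAGCC", oligos)
```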

[2143] In a preferred embodiment the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, a 46980 nucleic acid.

[2144] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a 46980 gene.

[2145] Use of 46980 Molecules as Surrogate Markers

[2146] The 46980 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 46980 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 46980 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker which correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[2147] The 46980 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker which correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., a 46980 marker) transcription or expression, the amplified marker may be in a quantity which is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-46980 antibodies may be employed in an immune-based detection system for a 46980 protein marker, or 46980-specific radiolabeled probes may be used to detect a 46980 mRNA marker. Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. 
Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[2148] The 46980 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker which correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35:1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA, or protein (e.g., 46980 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 46980 DNA may correlate with drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[2149] Pharmaceutical Compositions of 46980

[2150] The nucleic acid and polypeptides, fragments thereof, as well as anti-46980 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[2151] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[2152] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol or sorbitol, or sodium chloride, in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

[2153] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying which yields a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[2154] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[2155] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[2156] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[2157] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[2158] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[2159] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[2160] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds which exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
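
The therapeutic index defined above is a simple ratio, as the following sketch shows. The function name and the LD50 and ED50 values are hypothetical and serve only to illustrate the arithmetic; both doses are assumed to be expressed in the same units.

```python
def therapeutic_index(ld50, ed50):
    """Return LD50/ED50; both doses must share the same units (e.g., mg/kg)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50

# Hypothetical compound: LD50 of 200 mg/kg and ED50 of 10 mg/kg.
index = therapeutic_index(200.0, 10.0)
```

A higher ratio means a wider margin between the therapeutically effective dose and the lethal dose.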

[2161] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography. As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. 
Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
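
As an arithmetic illustration only, a dosage range stated per kilogram of body weight can be converted to an absolute amount for a given subject as sketched below. The function name and the 70 kg body weight are assumptions; the 1 to 10 mg/kg range is one of the ranges recited above. The sketch is not a dosing recommendation.

```python
def absolute_dose_range(low_mg_per_kg, high_mg_per_kg, body_weight_kg):
    """Convert a mg/kg range into a (low, high) range in mg for one subject."""
    if body_weight_kg <= 0 or low_mg_per_kg < 0 or high_mg_per_kg < low_mg_per_kg:
        raise ValueError("invalid range or body weight")
    return (low_mg_per_kg * body_weight_kg, high_mg_per_kg * body_weight_kg)

# Hypothetical 70 kg subject and the recited range of about 1 to 10 mg/kg.
dose_range_mg = absolute_dose_range(1.0, 10.0, 70.0)
```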

[2162] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J. Acquired Immune Deficiency Syndromes and Human Retrovirology 14:193).

[2163] The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e. including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[2164] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

[2165] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, puromycin, maytansinoids, e.g., maytansinol (see U.S. Pat. No. 5,208,020), CC-1065 (see U.S. Pat. Nos. 5,475,092, 5,585,499, 5,846,545) and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, CC-1065, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine, vinblastine, taxol and maytansinoids). Radioactive ions include, but are not limited to iodine, yttrium and praseodymium.

[2166] The conjugates of the invention can be used for modifying a given biological response; the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors. Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[2167] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[2168] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[2169] Methods of Treatment for 46980

[2170] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 46980 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[2171] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. “Pharmacogenomics”, as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype”, or “drug response genotype”). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 46980 molecules of the present invention or 46980 modulators according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

[2172] In one aspect, the invention provides a method for preventing in a subject, a disease or condition associated with an aberrant or unwanted 46980 expression or activity, by administering to the subject a 46980 or an agent which modulates 46980 expression or at least one 46980 activity. Subjects at risk for a disease which is caused or contributed to by aberrant or unwanted 46980 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 46980 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 46980 aberrance, for example, a 46980, 46980 agonist or 46980 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[2173] It is possible that some 46980 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[2174] In addition to the disorders mentioned above, the 46980 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of cellular proliferative and/or differentiative disorders, disorders associated with the thymus, disorders associated with bone metabolism, immune disorders, cardiovascular disorders, liver disorders, viral diseases, pain or metabolic disorders.

[2175] Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast and liver origin.

[2176] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[2177] Additional examples of proliferative disorders include hematopoietic neoplastic disorders. As used herein, the term “hematopoietic neoplastic disorders” includes diseases involving hyperplastic/neoplastic cells of hematopoietic origin, e.g., arising from myeloid, lymphoid or erythroid lineages, or precursor cells thereof. Preferably, the diseases arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit Rev. in Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to, acute lymphoblastic leukemia (ALL) which includes B-lineage ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell leukemia (HLL) and Waldenstrom's macroglobulinemia (WM). Additional forms of malignant lymphomas include, but are not limited to, non-Hodgkin lymphoma and variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGF), Hodgkin's disease and Reed-Sternberg disease. Disorders involving the thymus include developmental disorders, such as DiGeorge syndrome with thymic hypoplasia or aplasia; thymic cysts; thymic hyperplasia, which involves the appearance of lymphoid follicles within the thymus, creating thymic follicular hyperplasia; and thymomas, including germ cell tumors, lymphomas, Hodgkin disease, and carcinoids. Thymomas can include benign or encapsulated thymoma, and malignant thymoma Type I (invasive thymoma) or Type II, designated thymic carcinoma.

[2178] Aberrant expression and/or activity of 46980 molecules may mediate disorders associated with bone metabolism. “Bone metabolism” refers to direct or indirect effects in the formation or degeneration of bone structures, e.g., bone formation, bone resorption, etc., which may ultimately affect the concentrations in serum of calcium and phosphate. This term also includes effects mediated by 46980 molecules in bone cells, e.g., osteoclasts and osteoblasts, that may in turn result in bone formation and degeneration. For example, 46980 molecules may support different activities of bone resorbing osteoclasts such as the stimulation of differentiation of monocytes and mononuclear phagocytes into osteoclasts. Accordingly, 46980 molecules that modulate the production of bone cells can influence bone formation and degeneration, and thus may be used to treat bone disorders. Examples of such disorders include, but are not limited to, osteoporosis, osteodystrophy, osteomalacia, rickets, osteitis fibrosa cystica, renal osteodystrophy, osteosclerosis, anti-convulsant treatment, osteopenia, fibrogenesis-imperfecta ossium, secondary hyperparathyroidism, hypoparathyroidism, hyperparathyroidism, cirrhosis, obstructive jaundice, drug induced metabolism, medullary carcinoma, chronic renal disease, rickets, sarcoidosis, glucocorticoid antagonism, malabsorption syndrome, steatorrhea, tropical sprue, idiopathic hypercalcemia and milk fever.

[2179] The 46980 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of immune disorders. Examples of immune disorders or diseases include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy, such as atopic allergy.

[2180] Examples of disorders involving the heart or “cardiovascular disorder” include, but are not limited to, a disease, disorder, or state involving the cardiovascular system, e.g., the heart, the blood vessels, and/or the blood. A cardiovascular disorder can be caused by an imbalance in arterial pressure, a malfunction of the heart, or an occlusion of a blood vessel, e.g., by a thrombus. Examples of such disorders include hypertension, atherosclerosis, coronary artery spasm, congestive heart failure, coronary artery disease, valvular disease, arrhythmias, and cardiomyopathies.

[2181] Disorders which may be treated or diagnosed by methods described herein include, but are not limited to, disorders associated with an accumulation in the liver of fibrous tissue, such as that resulting from an imbalance between production and degradation of the extracellular matrix accompanied by the collapse and condensation of preexisting fibers. The methods described herein can be used to diagnose or treat hepatocellular necrosis or injury induced by a wide variety of agents including processes which disturb homeostasis, such as an inflammatory process, tissue damage resulting from toxic injury or altered hepatic blood flow, and infections (e.g., bacterial, viral and parasitic). For example, the methods can be used for the early detection of hepatic injury, such as portal hypertension or hepatic fibrosis. In addition, the methods can be employed to detect liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, A1-antitrypsin deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease), disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome). 
Additionally, the methods described herein may be useful for the early detection and treatment of liver injury associated with the administration of various chemicals or drugs, such as, for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol, or which represents a hepatic manifestation of a vascular disorder such as obstruction of either the intrahepatic or extrahepatic bile flow or an alteration in hepatic circulation resulting, for example, from chronic heart failure, veno-occlusive disease, portal vein thrombosis or Budd-Chiari syndrome.

[2182] Additionally, 46980 molecules may play an important role in the etiology of certain viral diseases, including but not limited to Hepatitis B, Hepatitis C and Herpes Simplex Virus (HSV). Modulators of 46980 activity could be used to control viral diseases. The modulators can be used in the treatment and/or diagnosis of viral infected tissue or virus-associated tissue fibrosis, especially liver and liver fibrosis. Also, 46980 modulators can be used in the treatment and/or diagnosis of virus-associated carcinoma, especially hepatocellular cancer.

[2183] Additionally, 46980 may play an important role in the regulation of metabolism or pain disorders. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes. Examples of pain disorders include, but are not limited to, pain response elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia, usually referred to as hyperalgesia (described in, for example, Fields, H. L. (1987) Pain, New York: McGraw-Hill); pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches; pain associated with surgery; pain related to irritable bowel syndrome; or chest pain.

[2184] As discussed, successful treatment of 46980 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using an assay described above, that prove to exhibit negative modulatory activity can be used in accordance with the invention to prevent and/or ameliorate symptoms of 46980 disorders. Such molecules can include, but are not limited to, peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)₂ and Fab expression library fragments, scFV molecules, and epitope-binding fragments thereof).

[2185] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[2186] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[2187] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 46980 expression is through the use of aptamer molecules specific for 46980 protein. Aptamers are nucleic acid molecules having a tertiary structure which permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. (1997) Curr. Opin. Chem. Biol. 1:5-9; and Patel, D. J. (1997) Curr. Opin. Chem. Biol. 1:32-46). Since nucleic acid molecules may in many cases be more conveniently introduced into target cells than therapeutic protein molecules may be, aptamers offer a method by which 46980 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pleiotropic effects.

[2188] Antibodies can be generated that are both specific for target gene product and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 46980 disorders. For a description of antibodies, see the Antibody section above.

[2189] In circumstances wherein injection of an animal or a human subject with a 46980 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 46980 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. (1999) Ann Med 31:66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. (1998) Cancer Treat Res. 94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 46980 protein. Vaccines directed to a disease characterized by 46980 expression may also be generated in this fashion.

[2190] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see e.g., Marasco et al. (1993) Proc. Natl. Acad. Sci. USA 90:7889-7893).

[2191] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 46980 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures as described above.

[2192] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED₅₀ with little or no toxicity. The dosage can vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose can be formulated in animal models to achieve a circulating plasma concentration range that includes the IC₅₀ (i.e., the concentration of the test compound that achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma can be measured, for example, by high performance liquid chromatography.
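
As an illustration of the initial estimate from cell culture data described above, the following minimal sketch interpolates an IC₅₀ on a log-concentration scale. The function name `estimate_ic50`, the data values, and the choice of log-linear interpolation between neighboring measurements are assumptions made for illustration only; they are not part of the specification, and a fitted dose-response model could be used instead.

```python
import math


def estimate_ic50(concentrations, inhibitions):
    """Interpolate the concentration that gives 50% inhibition.

    concentrations: ascending, positive concentrations (any consistent unit).
    inhibitions: percent inhibition (0-100) measured at each concentration.
    Returns None when the measurements never bracket 50% inhibition.
    """
    points = list(zip(concentrations, inhibitions))
    # Walk over neighboring pairs of measurements and find the pair
    # whose inhibition values bracket the 50% level.
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            # Interpolate linearly in log10(concentration).
            log_ic50 = math.log10(c_lo) + frac * (
                math.log10(c_hi) - math.log10(c_lo)
            )
            return 10 ** log_ic50
    return None


# Hypothetical cell culture measurements (illustrative values only).
doses = [0.1, 1.0, 10.0, 100.0]
percent_inhibition = [5.0, 30.0, 70.0, 95.0]
ic50 = estimate_ic50(doses, percent_inhibition)
print(ic50)
```

A value obtained in this way is only a starting point for the animal-model and plasma-level measurements described in the paragraph above.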

[2193] Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. The compound which is able to modulate 46980 activity is used as a template, or “imprinting molecule”, to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix which contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al (1996) Current Opinion in Biotechnology 7:89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2:166-173. Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al (1993) Nature 361:645-647. Through the use of isotope-labeling, the “free” concentration of compound which modulates the expression or activity of 46980 can be readily monitored and used in calculations of IC₅₀.

[2194] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC₅₀. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al (1995) Analytical Chemistry 67:2142-2144.

[2195] Another aspect of the invention pertains to methods of modulating 46980 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with a 46980 or agent that modulates one or more of the activities of 46980 protein associated with the cell. An agent that modulates 46980 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of a 46980 protein (e.g., a 46980 substrate or receptor), a 46980 antibody, a 46980 agonist or antagonist, a peptidomimetic of a 46980 agonist or antagonist, or other small molecule.

[2196] In one embodiment, the agent stimulates one or more 46980 activities. Examples of such stimulatory agents include active 46980 protein and a nucleic acid molecule encoding 46980. In another embodiment, the agent inhibits one or more 46980 activities. Examples of such inhibitory agents include antisense 46980 nucleic acid molecules, anti-46980 antibodies, and 46980 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of a 46980 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., up regulates or down regulates) 46980 expression or activity. In another embodiment, the method involves administering a 46980 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 46980 expression or activity.

[2197] Stimulation of 46980 activity is desirable in situations in which 46980 is abnormally downregulated and/or in which increased 46980 activity is likely to have a beneficial effect. Likewise, inhibition of 46980 activity is desirable in situations in which 46980 is abnormally upregulated and/or in which decreased 46980 activity is likely to have a beneficial effect.

[2198] 46980 Pharmacogenomics

[2199] The 46980 molecules of the present invention, as well as agents or modulators which have a stimulatory or inhibitory effect on 46980 activity (e.g., 46980 gene expression) as identified by a screening assay described herein, can be administered to individuals to treat (prophylactically or therapeutically) 46980-associated disorders (e.g., a neuronal disorder, such as a pain-related, neuronal connectivity-related, or neural degenerative disorder) associated with aberrant or unwanted 46980 activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a 46980 molecule or 46980 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a 46980 molecule or 46980 modulator.

[2200] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23:983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43:254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as single factors altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase (G6PD) deficiency is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[2201] One pharmacogenomics approach to identifying genes that predict drug response, known as a “genome-wide association”, relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high resolution map can be generated from a combination of some ten-million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
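
The grouping of individuals into genetic categories by SNP pattern can be sketched as follows. This is a minimal illustration only: the function name `group_by_snp_pattern`, the subject labels, the marker identifiers, and the genotype calls are all hypothetical, and a real study would involve far more markers and a statistical treatment of the resulting groups.

```python
from collections import defaultdict


def group_by_snp_pattern(genotypes, marker_ids):
    """Group individuals that share the same genotype at every listed marker.

    genotypes: mapping of individual -> {marker id -> genotype call}.
    marker_ids: the markers that define the pattern, in a fixed order.
    Returns a mapping of pattern (tuple of calls) -> list of individuals.
    """
    groups = defaultdict(list)
    for person, calls in genotypes.items():
        pattern = tuple(calls[marker] for marker in marker_ids)
        groups[pattern].append(person)
    return dict(groups)


# Hypothetical genotype calls at two bi-allelic markers.
genotypes = {
    "subject_1": {"snp_a": "A/A", "snp_b": "C/T"},
    "subject_2": {"snp_a": "A/G", "snp_b": "C/C"},
    "subject_3": {"snp_a": "A/A", "snp_b": "C/T"},
}
groups = group_by_snp_pattern(genotypes, ["snp_a", "snp_b"])
for pattern, members in groups.items():
    print(pattern, members)
```

Each resulting group collects individuals with an identical pattern at the chosen markers, which is the unit to which a treatment regimen could then be tailored.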

[2202] Alternatively, a method termed the “candidate gene approach” can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., a 46980 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.
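
One way to ask whether carrying one version of a gene is associated with a drug response is a test on a 2x2 table of carriers versus non-carriers and responders versus non-responders. The sketch below uses a two-sided Fisher's exact test as an assumed choice of statistic; the specification does not name a test, and the function name and the counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Rows: variant carriers / non-carriers.
    Columns: responders / non-responders.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x responders among the carriers,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the
    # observed one (small tolerance guards against rounding).
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: 8 of 10 carriers responded, 1 of 6 non-carriers did.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(p_value)
```

A small p-value would suggest, under these assumptions, that the variant and the response are not independent; it does not by itself establish a causal role for the variant.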

[2203] Alternatively, a method termed “gene expression profiling” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., a 46980 molecule or 46980 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[2204] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 46980 molecule or 46980 modulator, such as a modulator identified by one of the exemplary screening assays described herein.

[2205] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 46980 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 46980 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells, e.g., human cells, will become sensitive to treatment with an agent that the unmodified target cells were resistant to.

[2206] Monitoring the influence of agents (e.g., drugs) on the expression or activity of a 46980 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 46980 gene expression, protein levels, or upregulate 46980 activity, can be monitored in clinical trials of subjects exhibiting decreased 46980 gene expression, protein levels, or downregulated 46980 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 46980 gene expression, protein levels, or downregulate 46980 activity, can be monitored in clinical trials of subjects exhibiting increased 46980 gene expression, protein levels, or upregulated 46980 activity. In such clinical trials, the expression or activity of a 46980 gene, and preferably, other genes that have been implicated in, for example, a 46980-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[2207] 46980 Informatics

[2208] The sequence of a 46980 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a 46980. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 46980 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequence, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[2209] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[2210] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[2211] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples for annotation to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites, and splice junctions. Non-limiting examples for annotations to amino acid sequence include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
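The two-table layout described in this paragraph can be sketched as follows. This is a minimal illustration only, assuming SQLite (via Python's standard sqlite3 module) in place of Sybase or Oracle; the table names, column names, helper functions, and the placeholder residue string are all hypothetical and are not taken from the specification.

```python
import sqlite3


def build_sequence_db(conn):
    """Create a sequence table and an annotation table (hypothetical names)."""
    # First table: sequence in the first column, identifier in the second.
    conn.execute(
        "CREATE TABLE sequences ("
        " residues TEXT NOT NULL,"
        " seq_id TEXT PRIMARY KEY)"
    )
    # Second table: identifier, descriptor, initial and ultimate positions.
    conn.execute(
        "CREATE TABLE annotations ("
        " seq_id TEXT NOT NULL REFERENCES sequences(seq_id),"
        " descriptor TEXT NOT NULL,"
        " start_pos INTEGER NOT NULL,"
        " end_pos INTEGER NOT NULL)"
    )


def annotated_fragments(conn, seq_id):
    """Return (descriptor, fragment) pairs; positions are 1-based, inclusive."""
    rows = conn.execute(
        "SELECT a.descriptor,"
        " substr(s.residues, a.start_pos, a.end_pos - a.start_pos + 1)"
        " FROM annotations a JOIN sequences s ON s.seq_id = a.seq_id"
        " WHERE a.seq_id = ? ORDER BY a.start_pos",
        (seq_id,),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
build_sequence_db(conn)
# Placeholder residues for illustration; this is not a 46980 sequence.
conn.execute("INSERT INTO sequences VALUES (?, ?)", ("MKTAYIAKQRQISFVK", "demo_seq"))
conn.execute(
    "INSERT INTO annotations VALUES (?, ?, ?, ?)",
    ("demo_seq", "example site", 3, 6),
)
print(annotated_fragments(conn, "demo_seq"))  # [('example site', 'TAYI')]
```

Keeping the annotation coordinates in their own table lets one sequence row carry any number of annotations without repeating the residue string.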

[2212] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.
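As a rough illustration of searching stored sequence information for a target motif, the sketch below scans a sequence with a regular expression and reports every match, including overlapping ones. It is not a BLAST search; it is a much simpler pattern scan, and the function name, the example sequence, and the example pattern are hypothetical.

```python
import re


def find_motif(sequence, pattern):
    """Return (start, end, match) for each hit, 1-based inclusive coordinates.

    A lookahead wrapper is used so that overlapping hits are also reported.
    """
    regex = re.compile("(?=(" + pattern + "))")
    hits = []
    for m in regex.finditer(sequence):
        start = m.start(1) + 1  # convert 0-based index to 1-based position
        end = m.end(1)          # exclusive 0-based end equals inclusive 1-based end
        hits.append((start, end, m.group(1)))
    return hits


# Arbitrary example sequence and pattern ("." matches any single residue).
print(find_motif("GASAGGSAG", "G.S.G"))
# [(1, 5, 'GASAG'), (5, 9, 'GGSAG')]
```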

[2213] Thus, in one aspect, the invention features a method of analyzing 46980, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing a 46980 nucleic acid or amino acid sequence; and comparing the 46980 sequence with a second sequence, e.g., one or, more preferably, a plurality of sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 46980. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[2214] The method can include evaluating the sequence identity between a 46980 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.
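One simple way to evaluate sequence identity, once two sequences have already been aligned, is to count identical columns. The sketch below assumes a pre-computed pairwise alignment with "-" as the gap character and defines identity as identical columns divided by aligned columns; that definition, the function name, and the example strings are assumptions made for illustration, and other conventions for the denominator exist.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns in which both sequences carry a gap.
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        return 0.0
    identical = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * identical / len(columns)


# Four of six aligned columns are identical in this toy alignment.
print(round(percent_identity("ACGT-A", "ACCTGA"), 1))  # 66.7
```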

[2215] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
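The relationship between target length and chance occurrence can be made concrete with a back-of-the-envelope estimate. The sketch below assumes a simplified model in which every residue is independent and equally likely, so that a specific target of length k over an alphabet of size a is expected to occur about N × a^(−k) times among N database positions; real databases are not uniform, and the function name and figures are illustrative only.

```python
def expected_random_hits(target_length, alphabet_size, database_positions):
    """Expected chance occurrences of one specific target (uniform model)."""
    per_position_probability = (1.0 / alphabet_size) ** target_length
    return database_positions * per_position_probability


# A 6-nucleotide target is expected once per 4**6 = 4096 positions.
print(expected_random_hits(6, 4, 4096))  # 1.0

# Lengthening the target sharply reduces the expected number of chance hits.
for length in (6, 10, 30):
    print(length, expected_random_hits(length, 4, 10**9))
```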

[2216] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means are available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[2217] Thus, the invention features a method of making a computer readable record of a 46980 sequence which includes recording the sequence on a computer readable matrix. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[2218] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing a 46980 sequence, or record, in machine-readable form; comparing a second sequence to the 46980 sequence; thereby analyzing a sequence. Comparison can include comparing to sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 46980 sequence includes a sequence being compared. In a preferred embodiment the 46980 or second sequence is stored on a first computer, e.g., at a first site and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 46980 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[2219] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has a 46980-associated disease or disorder or a pre-disposition to a 46980-associated disease or disorder, wherein the method comprises the steps of determining 46980 sequence information associated with the subject and based on the 46980 sequence information, determining whether the subject has a 46980-associated disease or disorder or a pre-disposition to a 46980-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[2220] The invention further provides in an electronic system and/or in a network, a method for determining whether a subject has a 46980-associated disease or disorder or a pre-disposition to a disease associated with a 46980 wherein the method comprises the steps of determining 46980 sequence information associated with the subject, and based on the 46980 sequence information, determining whether the subject has a 46980-associated disease or disorder or a pre-disposition to a 46980-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, comparing the 46980 sequence of the subject to the 46980 sequences in the database to thereby determine whether the subject has a 46980-associated disease or disorder, or a pre-disposition for such.

[2221] The present invention also provides in a network, a method for determining whether a subject has a 46980-associated disease or disorder or a pre-disposition to a 46980-associated disease or disorder, said method comprising the steps of receiving 46980 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 46980 and/or corresponding to a 46980-associated disease or disorder (e.g., a neuronal disorder, e.g., a pain-related, neuronal connectivity-related, or neural degenerative disorder), and based on one or more of the phenotypic information, the 46980 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has a 46980-associated disease or disorder or a pre-disposition to a 46980-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[2222] The present invention also provides a method for determining whether a subject has a 46980-associated disease or disorder or a pre-disposition to a 46980-associated disease or disorder, said method comprising the steps of receiving information related to 46980 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 46980 and/or related to a 46980-associated disease or disorder, and based on one or more of the phenotypic information, the 46980 information, and the acquired information, determining whether the subject has a 46980-associated disease or disorder or a pre-disposition to a 46980-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[2223] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

Background of the 32225 Invention

[2224] The hydrolysis of chemical bonds is critical for most metabolic (e.g., catabolic and anabolic) pathways in cells. Hydrolases are a large class of enzymes which catalyze the cleavage of a bond with the addition of water. In particular, the α/β hydrolase family of enzymes is a phylogenetically diverse group of enzymes that have a common fold, typically comprising an eight-stranded β-sheet surrounded by α-helices (Ollis, D. et al. (1992) Protein Eng 5:197-211; Nardini and Dijkstra (1999) Curr Opin Str Bio 9:732-737). Members of the α/β hydrolase family are found in nearly all organisms, from microbes to plants to humans. Members of the hydrolase family of enzymes include enzymes that hydrolyze ester bonds (e.g., phosphatases, sulfatases, exonucleases, and endonucleases), glycosidases, enzymes that act on ether bonds, peptidases (e.g., exopeptidases and endopeptidases), as well as enzymes that hydrolyze carbon-nitrogen bonds, acid anhydrides, carbon-carbon bonds, halide bonds, phosphorous-nitrogen bonds, sulfur-nitrogen bonds, carbon-phosphorous bonds, and sulfur-sulfur bonds (E. C. Webb ed., Enzyme Nomenclature, pp. 306-450, ©1992 Academic Press, Inc. San Diego, Calif.).

[2225] α/β hydrolases vary widely in primary sequence, substrate specificity, and physical properties. However, despite the lack of sequence homology, hydrolase family members display structural similarities, e.g., conservation of a catalytic site framework. For example, the alpha/beta hydrolase fold is a structural motif that is common to a variety of hydrolytic enzymes including lipases, e.g., fungal, bacterial and pancreatic lipases, acetylcholinesterases, serine carboxypeptidases, prolyl aminopeptidases, haloalkane dehalogenases, dienelactone hydrolases, A₂ bromoperoxidases, and thioesterases (Schrag, J. et al. (1997) Meth. Enzymol. 284:85-107). Of particular medical significance, the α/β hydrolase family includes acetylcholinesterases, epoxide hydrolases, cholesterol esterases, and lipases. Inhibitors of acetylcholinesterase are useful therapeutic agents for the treatment of Alzheimer's disease, myasthenia gravis, and glaucoma. Epoxide hydrolases detoxify harmful aromatic compounds in mammals. The human hormone sensitive lipase catalyzes the rate-limiting reaction of hydrolyzing fat in adipocytes.

[2226] Enzymes possessing the α/β hydrolase fold have diverged from a common ancestor so as to preserve the arrangement of the catalytic residues (Ollis, D. et al. (1992) Protein Eng. 5:197-211). In particular, one conserved feature of the alpha/beta hydrolase fold is a nucleophile-histidine-acid catalytic triad. The identities of the triad residues in alpha/beta hydrolase fold enzymes are quite variable in that serine, aspartate, and cysteine have all been identified as catalytic nucleophiles (Schrag, J. et al. supra).

[2227] Hydrolases play important roles in the synthesis and breakdown of nearly all major metabolic intermediates, including polypeptides, nucleic acids, and lipids. As such, their activity contributes to the ability of the cell to grow and differentiate, to proliferate, to adhere and move, and to interact and communicate with other cells. Hydrolases also are important in the conversion of pro-proteins and pro-hormones to their active forms, the inactivation of peptides, the biotransformation of compounds (e.g., a toxin or carcinogen), antigen presentation, and the regulation of synaptic transmission.

Summary of the 32225 Invention

[2228] The present invention is based, in part, on the discovery of a novel α/β hydrolase family member, referred to herein as “32225”. The nucleotide sequence of a cDNA encoding 32225 is shown in SEQ ID NO:33, and the amino acid sequence of a 32225 polypeptide is shown in SEQ ID NO:34. In addition, the nucleotide sequences of the coding region are depicted in SEQ ID NO:35.

[2229] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes a 32225 protein or polypeptide, e.g., a biologically active portion of the 32225 protein. In a preferred embodiment the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO:34. In other embodiments, the invention provides isolated 32225 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO:33, SEQ ID NO:35, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO:33, SEQ ID NO:35, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:33, SEQ ID NO:35, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 32225 protein or an active fragment thereof.

[2230] In a related aspect, the invention further provides nucleic acid constructs that include a 32225 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included are vectors and host cells containing the 32225 nucleic acid molecules of the invention, e.g., vectors and host cells suitable for producing 32225 nucleic acid molecules and polypeptides.

[2231] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 32225-encoding nucleic acids.

[2232] In still another related aspect, isolated nucleic acid molecules that are antisense to a 32225 encoding nucleic acid molecule are provided.

[2233] In another aspect, the invention features 32225 polypeptides, and biologically active or antigenic fragments thereof that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 32225-mediated or -related disorders. In another embodiment, the invention provides 32225 polypeptides having a 32225 activity. Preferred polypeptides are 32225 proteins including at least one α/β hydrolase domain, and, preferably, having a 32225 activity, e.g., a 32225 activity as described herein.

[2234] In other embodiments, the invention provides 32225 polypeptides, e.g., a 32225 polypeptide having the amino acid sequence shown in SEQ ID NO:34 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO:34 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:33, SEQ ID NO:35, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 32225 protein or an active fragment thereof.

[2235] In a related aspect, the invention further provides nucleic acid constructs which include a 32225 nucleic acid molecule described herein.

[2236] In a related aspect, the invention provides 32225 polypeptides or fragments operatively linked to non-32225 polypeptides to form fusion proteins.

[2237] In another aspect, the invention features antibodies and antigen-binding fragments thereof, that react with, or more preferably specifically bind 32225 polypeptides or fragments thereof, e.g., a peptide loop that is involved in hydrolysis. In one embodiment, the antibodies or antigen-binding fragment thereof competitively inhibit the binding of a second antibody to a 32225 polypeptide or a fragment thereof, e.g., a peptide loop that is involved in hydrolysis.

[2238] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 32225 polypeptides or nucleic acids.

[2239] In still another aspect, the invention provides a process for modulating 32225 polypeptide or nucleic acid expression or activity, e.g., using the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 32225 polypeptides or nucleic acids, such as conditions involving aberrant or deficient breakdown of toxic molecules, neuronal function, or cellular proliferation and/or differentiation.

[2240] The invention also provides assays for determining the activity of or the presence or absence of 32225 polypeptides or nucleic acid molecules in a biological sample, including for disease diagnosis.

[2241] In yet another aspect, the invention provides methods for inhibiting the proliferation, or inducing the killing, of a 32225-expressing cell, e.g., a hyper-proliferative 32225-expressing cell. The method includes contacting the cell with a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 32225 polypeptide or nucleic acid. In a preferred embodiment, the contacting step is effected in vitro or ex vivo. In other embodiments, the contacting step is effected in vivo, e.g., in a subject (e.g., a mammal, e.g., a human), as part of a therapeutic or prophylactic protocol. In a preferred embodiment, the cell is a hyperproliferative cell, e.g., a cell found in a solid tumor, a soft tissue tumor, or a metastatic lesion, e.g., a tumor or lesion located in a lung, breast, colon, ovary, brain, or liver tissue.

[2242] In a preferred embodiment, the compound is an inhibitor of a 32225 polypeptide. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion). In another preferred embodiment, the compound is an inhibitor of a 32225 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[2243] In a preferred embodiment, the compound is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[2244] In another embodiment, the compound is an activator of a 32225 polypeptide. Preferably, the activator is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody.

[2245] In another aspect, the invention features methods for treating or preventing a disorder characterized by aberrant cellular proliferation or differentiation of a 32225-expressing cell, in a subject. Preferably, the method includes administering to the subject (e.g., a mammal, e.g., a human) an effective amount of a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 32225 polypeptide or nucleic acid. In a preferred embodiment, the disorder is a cancerous or pre-cancerous condition.

[2246] In a further aspect, the invention provides methods for evaluating the efficacy of a treatment of a disorder, e.g., a proliferative disorder, a liver disorder, or a neural disorder. The method includes: treating a subject, e.g., a patient or an animal, with a protocol under evaluation (e.g., treating a subject with one or more of: chemotherapy, radiation, and/or a compound identified using the methods described herein); and evaluating the expression of a 32225 nucleic acid or polypeptide before and after treatment. A change, e.g., a decrease or increase, in the level of a 32225 nucleic acid (e.g., mRNA) or polypeptide after treatment, relative to the level of expression before treatment, is indicative of the efficacy of the treatment of the disorder. The level of 32225 nucleic acid or polypeptide expression can be detected by any method described herein.

[2247] In a preferred embodiment, the evaluating step includes obtaining a sample (e.g., a tissue sample, e.g., a biopsy, or a fluid sample) from the subject, before and after treatment, and comparing the level of expression of a 32225 nucleic acid (e.g., mRNA) or polypeptide before and after treatment.

[2248] In another aspect, the invention provides methods for evaluating the efficacy of a therapeutic or prophylactic agent (e.g., an anti-neoplastic agent). The method includes: contacting a sample with an agent (e.g., a compound identified using the methods described herein, a cytotoxic agent) and, evaluating the expression of 32225 nucleic acid or polypeptide in the sample before and after the contacting step. A change, e.g., a decrease or increase, in the level of 32225 nucleic acid (e.g., mRNA) or polypeptide in the sample obtained after the contacting step, relative to the level of expression in the sample before the contacting step, is indicative of the efficacy of the agent. The level of 32225 nucleic acid or polypeptide expression can be detected by any method described herein. In a preferred embodiment, the sample includes cells obtained from a cancerous tissue, e.g., a cancerous lung, breast, colon, ovary, brain, or liver tissue.

[2249] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 32225 polypeptide or nucleic acid molecule, including for disease diagnosis.

[2250] In another aspect, the invention features a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes a 32225 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to a 32225 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 32225 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.

[2251] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

Detailed Description of 32225

[2252] The human 32225 sequence (see SEQ ID NO:33, as recited in Example 24), which is approximately 2305 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 1029 nucleotides, including the termination codon. The coding sequence encodes a 342 amino acid protein (see SEQ ID NO:34, as recited in Example 24).
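The stated lengths are mutually consistent: a coding sequence of 1029 nucleotides is 343 codons, and subtracting the termination codon leaves 342 amino acids. The short sketch below shows that arithmetic; the function name is hypothetical.

```python
def protein_length_from_cds(cds_nucleotides, includes_stop=True):
    """Number of amino acids encoded by a coding sequence of the given length."""
    if cds_nucleotides % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    codons = cds_nucleotides // 3
    # The termination codon does not encode an amino acid.
    return codons - 1 if includes_stop else codons


print(protein_length_from_cds(1029))  # 342
```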

[2253] Human 32225 contains the following regions or other structural features:

[2254] an α/β hydrolase domain (PFAM Accession Number PF00561) located at about amino acid residues 95 to 338 of SEQ ID NO:34;

[2255] an α/β hydrolase domain nucleophile motif, located at about amino acid residues 144 to 149 of SEQ ID NO:34;

[2256] an α/β hydrolase domain catalytic histidine residue, located at about amino acid 320 of SEQ ID NO:34;

[2257] six predicted protein kinase C phosphorylation sites (PS00005) located at about amino acids 19 to 21, 92 to 94, 108 to 110, 130 to 132, 156 to 158, and 297 to 299 of SEQ ID NO:34;

[2258] four predicted casein kinase II phosphorylation sites (PS00006) located at about amino acids 126 to 129, 130 to 133, 237 to 240, and 291 to 294 of SEQ ID NO:34;

[2259] two predicted N-myristylation sites (PS00008) located at about amino acids 78 to 83, and 148 to 153 of SEQ ID NO:34; and

[2260] one predicted amidation site (PS00009) located at about amino acids 297 to 300 of SEQ ID NO:34.
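The features listed above can be held as simple coordinate records and checked for internal consistency, for example that every feature lies within the 342-residue protein and that the nucleophile motif and catalytic histidine fall inside the α/β hydrolase domain. The sketch below is illustrative only; the record type, names, and helper function are hypothetical, while the coordinates are those recited above.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


PROTEIN_LENGTH = 342  # amino acids in SEQ ID NO:34

DOMAIN = Feature("alpha/beta hydrolase domain (PF00561)", 95, 338)
NUCLEOPHILE_MOTIF = Feature("nucleophile motif", 144, 149)
CATALYTIC_HIS = Feature("catalytic histidine", 320, 320)

PKC_SITES = [Feature("PKC site (PS00005)", s, e) for s, e in
             [(19, 21), (92, 94), (108, 110), (130, 132), (156, 158), (297, 299)]]
CK2_SITES = [Feature("CK2 site (PS00006)", s, e) for s, e in
             [(126, 129), (130, 133), (237, 240), (291, 294)]]
MYRISTYL_SITES = [Feature("N-myristylation site (PS00008)", s, e) for s, e in
                  [(78, 83), (148, 153)]]
AMIDATION_SITES = [Feature("amidation site (PS00009)", 297, 300)]

FEATURES = ([DOMAIN, NUCLEOPHILE_MOTIF, CATALYTIC_HIS]
            + PKC_SITES + CK2_SITES + MYRISTYL_SITES + AMIDATION_SITES)


def within(inner, outer):
    """True if the inner feature lies entirely inside the outer feature."""
    return outer.start <= inner.start and inner.end <= outer.end


# Every feature should fit inside the protein.
assert all(1 <= f.start <= f.end <= PROTEIN_LENGTH for f in FEATURES)

# The catalytic elements should fall inside the hydrolase domain.
print(within(NUCLEOPHILE_MOTIF, DOMAIN), within(CATALYTIC_HIS, DOMAIN))  # True True
```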

[2261] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[2262] A plasmid containing the nucleotide sequence encoding human 32225 (clone “Fbh32225FL”) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

[2263] The 32225 protein contains a significant number of structural characteristics in common with members of the α/β hydrolase family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics.

[2264] A hydrolase family of proteins includes a diverse array of enzymes and is characterized by a common fold. The domain fold typically contains an eight stranded β-sheet surrounded by α-helices, and frequently contains conserved catalytic residues. This enzyme family includes lipases, esterases, and proteases. Despite the variety of reactions this family is capable of, the chemistry of these reactions is generally similar. Three positions in particular form a catalytic triad contributing nucleophilic, acidic, and histidine residues. The side chains of these residues are critical for nucleophilic attack. Another conserved feature, an oxyanion hole near strand three, is believed to stabilize potential covalent intermediates. This intermediate then proceeds to product by general base catalysis (Ollis, D. et al. (1992) Protein Eng 5:197-211). The α/β hydrolase family includes polypeptides with acetylcholinesterase, epoxide hydrolase, cholesterol esterase, and lipase activity. Acetylcholinesterases are important for neurotransmission; epoxide hydrolases for detoxifying harmful aromatic chemicals; and lipases for cholesterol and fat metabolism. Thus, the family of α/β hydrolases includes enzymes critical for the proper function of many physiological systems.

[2265] A 32225 polypeptide can include an “α/β hydrolase domain” or regions homologous with an “α/β hydrolase domain”.

[2266] As used herein, the term “α/β hydrolase domain” includes an amino acid sequence of about 200 to 400 amino acid residues in length and having a bit score for the alignment of the sequence to the α/β hydrolase domain profile (Pfam HMM) of at least 40. Preferably, an α/β hydrolase domain includes at least about 220 to 380 amino acids, more preferably about 230 to 365 amino acid residues, or about 240 to 350 amino acids and has a bit score for the alignment of the sequence to the hydrolase domain (HMM) of at least 50, 60, 70, 75 or greater. The hydrolase domain (HMM) has been assigned the PFAM Accession Number PF00561 (http://genome.wustl.edu/Pfam/.html). An alignment of the hydrolase domain (amino acids 95 to 338 of SEQ ID NO:34) of human 32225 with a consensus amino acid sequence (SEQ ID NO:36) derived from a hidden Markov model is depicted in FIG. 19.

[2267] In a preferred embodiment a 32225 polypeptide or protein has an “α/β hydrolase domain” or a region which includes at least about 220 to 380, more preferably about 230 to 365, or 240 to 350 amino acid residues and has at least about 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with an “α/β hydrolase domain,” e.g., the hydrolase domain of human 32225 (e.g., residues 95 to 338 of SEQ ID NO:34).
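The length and bit-score criteria in this definition amount to a simple two-part test, sketched below. The function name and default parameters are hypothetical, and the bit score passed in the example is a made-up placeholder, since no score for the 32225 alignment is recited here; only the domain boundaries (residues 95 to 338) come from the text.

```python
def is_ab_hydrolase_domain(length, bit_score,
                           min_length=200, max_length=400, min_score=40.0):
    """Apply the length window and minimum Pfam HMM bit score."""
    return min_length <= length <= max_length and bit_score >= min_score


# Residues 95 to 338 inclusive span 244 amino acids.
domain_length = 338 - 95 + 1
print(domain_length)  # 244

# 55.0 is a placeholder score used only to exercise the function.
print(is_ab_hydrolase_domain(domain_length, 55.0))  # True

# The narrower preferred window (240 to 350 residues, score of at least 75).
print(is_ab_hydrolase_domain(domain_length, 55.0,
                             min_length=240, max_length=350, min_score=75.0))  # False
```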

[2268] To identify the presence of a “hydrolase” domain in a 32225 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against the Pfam database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference. A search was performed against the HMM database resulting in the identification of an “α/β hydrolase” domain in the amino acid sequence of human 32225 at about residues 95 to 338 of SEQ ID NO:34 (see FIG. 19).

[2269] Alternatively, to identify the presence of an “α/β hydrolase” domain in a 32225 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against a different database of domains, e.g., the ProDom database (Corpet et al. (1999) Nucl. Acids Res. 27:263-267). The ProDom protein domain database consists of an automatic compilation of homologous domains. Current versions of ProDom are built using recursive PSI-BLAST searches (Altschul S F et al. (1997) Nucleic Acids Res. 25:3389-3402; Gouzy et al. (1999) Computers and Chemistry 23:333-340) of the SWISS-PROT 38 and TREMBL protein databases. The database automatically generates a consensus sequence for each domain. A BLAST search was performed against the ProDom database resulting in the identification of several fragments of an “α/β hydrolase” domain in the amino acid sequence of human 32225, located at about amino acid residues 8 to 70, 187 to 277, and 280 to 342 of SEQ ID NO:34 (see FIGS. 20A-20C).

[2270] In one embodiment, a 32225 protein includes at least one α/β hydrolase domain nucleophile motif. As used herein, an “α/β hydrolase domain nucleophile motif” includes a sequence of at least five amino acid residues defined by the sequence: G-X-S-X-G-G (SEQ ID NO:40). An α/β hydrolase domain nucleophile motif, as defined, can be involved in the enzymatic hydrolysis of a chemical bond, e.g., the bond between a nucleophilic carbon atom and a relatively electrophilic atom to which the carbon is bound, e.g., an oxygen, nitrogen, or chlorine atom. More preferably, an α/β hydrolase domain nucleophile motif includes six amino acid residues. α/β hydrolase domain nucleophile motifs have been described in, e.g., Ollis et al. (1992), supra, and Nardini and Dijkstra (1999), supra, the contents of which are incorporated herein by reference.
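
A search for the six-residue form of this motif can be sketched as a pattern match. The following Python example is illustrative only: it assumes that X stands for any single amino acid residue, the function name is hypothetical, and the test sequence is an invented string rather than any sequence of the invention.

```python
import re

# G-X-S-X-G-G, with X assumed to match any single residue.
# The lookahead lets overlapping occurrences be reported as well.
NUCLEOPHILE_MOTIF = re.compile(r"(?=(G.S.GG))")


def find_nucleophile_motifs(sequence):
    """Return (start, end, residues) for each match, 1-based and inclusive."""
    hits = []
    for match in NUCLEOPHILE_MOTIF.finditer(sequence.upper()):
        start = match.start() + 1
        hits.append((start, start + 5, match.group(1)))
    return hits


# Invented test sequence; not taken from SEQ ID NO:34.
print(find_nucleophile_motifs("MKTGHSMGGAL"))
```

Positions are reported using 1-based residue numbering, the same convention used for residue ranges throughout this description.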

[2271] In a preferred embodiment, a 32225 polypeptide or protein has at least one α/β hydrolase domain nucleophile motif, or a region which includes at least five, or even six, amino acid residues and has at least 70%, 80%, 90%, or 100% homology with an “α/β hydrolase domain nucleophile motif”, e.g., at least one α/β hydrolase domain nucleophile motif of human 32225, e.g., amino acid residues 144 to 149 of SEQ ID NO:34.

[2272] In another preferred embodiment, a 32225 protein includes at least one α/β hydrolase domain catalytic histidine residue. As used herein, an “α/β hydrolase domain catalytic histidine residue” is a conserved histidine residue located toward the C-terminal end of an α/β hydrolase domain in a peptide loop that connects two elements of secondary structure, e.g., β-sheets and α-helices. An α/β hydrolase domain catalytic histidine residue, as defined, can be involved in the enzymatic hydrolysis of a chemical bond, e.g., the bond between a nucleophilic carbon atom and a relatively electrophilic atom to which the carbon is bound, e.g., an oxygen, nitrogen, or chlorine atom. α/β hydrolase domain catalytic histidine residues have been described in, e.g., Ollis et al. (1992), supra, and Nardini and Dijkstra (1999), supra, the contents of which are incorporated herein by reference.

[2273] A 32225 family member can include at least one α/β hydrolase domain. Furthermore, a 32225 family member can include at least one α/β hydrolase domain nucleophile motif; at least one α/β hydrolase domain catalytic histidine residue; at least one, two, three, four, five, preferably six predicted protein kinase C phosphorylation sites (PS00005); at least one, two, three, preferably four predicted casein kinase II phosphorylation sites (PS00006); at least one, preferably two predicted N-myristylation sites (PS00008); and at least one predicted amidation site (PS00009).

[2274] As the 32225 polypeptides of the invention may modulate 32225-mediated activities, they may be useful as, or for developing, novel diagnostic and therapeutic agents for 32225-mediated or related disorders, as described below.

[2275] As used herein, a “32225 activity”, “biological activity of 32225” or “functional activity of 32225”, refers to an activity exerted by a 32225 protein, polypeptide or nucleic acid molecule. For example, a 32225 activity can be an activity exerted by 32225 in a physiological milieu on, e.g., a 32225-responsive cell or on a 32225 substrate, e.g., a small molecule substrate. A 32225 activity can be determined in vivo or in vitro. In one embodiment, a 32225 activity is a direct activity, such as an association with a 32225 target molecule. A “target molecule” or “binding partner” is a molecule with which a 32225 protein binds or interacts in nature. In an exemplary embodiment, 32225 is an enzyme for a substrate that contains a nucleophilic carbon atom bound to a relatively electrophilic atom, e.g., an oxygen, nitrogen, or chlorine atom. Such substrates are present in many small molecules, e.g., hydroxymuconic semialdehyde, 1,2-dichloroethane, acetylcholine, and triacylglycerols.

[2276] A 32225 activity can also be an indirect activity, e.g., a cellular signaling activity mediated by interaction of the 32225 protein with a 32225 receptor. The features of the 32225 molecules of the present invention can provide similar biological activities as α/β hydrolase family members. For example, the 32225 proteins of the present invention can have one or more of the following activities: (1) hydrolysis of acetylcholine and other neurotransmitters; (2) hydrolysis of epoxides and other toxic chemicals; (3) hydrolysis of lipid substrates; (4) hydrolysis of cholesterol; (5) protease activity; and (6) thioesterase activity. In addition, the 32225 protein may have a critical function in one or more of the following physiological processes: (1) neurotransmitter function; (2) detoxification; (3) metabolite regulation and degradation; and (4) toxin removal and neutralization. Selected 32225 polypeptides can antagonize, e.g., by competitive inhibition, one of the above-mentioned activities.

[2277] Thus, the 32225 molecules can act as novel diagnostic targets and therapeutic agents for controlling cell growth and/or differentiative disorders, neurological disorders, liver disorders, metabolic disorders, and cardiovascular and/or endothelial cell disorders.

[2278] Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast and liver origin.

[2279] As used herein, the terms “cancer”, “hyperproliferative” and “neoplastic” refer to cells having the capacity for autonomous growth. Examples of such cells include cells having an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative and neoplastic disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. “Pathologic hyperproliferative” cells occur in disease states characterized by malignant tumor growth. Examples of non-pathologic hyperproliferative cells include proliferation of cells associated with wound repair.

[2280] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[2281] The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

[2282] The term “sarcoma” is art recognized and refers to malignant tumors of mesenchymal derivation.

[2283] Examples of cellular proliferative and/or differentiative disorders of the colon include, but are not limited to, non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors.

[2284] Examples of cellular proliferative and/or differentiative disorders of the liver include, but are not limited to, nodular hyperplasias, adenomas, and malignant tumors, including primary carcinoma of the liver and metastatic tumors.

[2285] Examples of cellular proliferative and/or differentiative disorders of the breast include, but are not limited to, proliferative breast disease including, e.g., epithelial hyperplasia, sclerosing adenosis, and small duct papillomas; tumors, e.g., stromal tumors such as fibroadenoma, phyllodes tumor, and sarcomas, and epithelial tumors such as large duct papilloma; carcinoma of the breast including in situ (noninvasive) carcinoma that includes ductal carcinoma in situ (including Paget's disease) and lobular carcinoma in situ, and invasive (infiltrating) carcinoma including, but not limited to, invasive ductal carcinoma, invasive lobular carcinoma, medullary carcinoma, colloid (mucinous) carcinoma, tubular carcinoma, and invasive papillary carcinoma, and miscellaneous malignant neoplasms. Disorders in the male breast include, but are not limited to, gynecomastia and carcinoma.

[2286] Examples of cellular proliferative and/or differentiative disorders of the lung include, but are not limited to, bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

[2287] Additional examples of proliferative disorders include hematopoietic neoplastic disorders. As used herein, the term “hematopoietic neoplastic disorders” includes diseases involving hyperplastic/neoplastic cells of hematopoietic origin. A hematopoietic neoplastic disorder can arise from myeloid, lymphoid or erythroid lineages, or precursor cells thereof. Preferably, the diseases arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit Rev. in Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to, acute lymphoblastic leukemia (ALL), which includes B-lineage ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell leukemia (HCL) and Waldenstrom's macroglobulinemia (WM). Additional forms of malignant lymphomas include, but are not limited to, non-Hodgkin lymphoma and variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGL), Hodgkin's disease and Reed-Sternberg disease.

[2288] A 32225 polypeptide may be involved with neuron outgrowth, central nervous system (CNS) development, psychiatric function, and neuronal repair. Neurological disorders include, but are not limited to, disorders involving neurons, and disorders involving glia, such as astrocytes, oligodendrocytes, ependymal cells, and microglia; cerebral edema, raised intracranial pressure and herniation, and hydrocephalus; malformations and developmental diseases, such as neural tube defects, forebrain anomalies, posterior fossa anomalies, and syringomyelia and hydromyelia; perinatal brain injury; cerebrovascular diseases, such as those related to hypoxia, ischemia, and infarction, including hypotension, hypoperfusion, and low-flow states (global cerebral ischemia) and focal cerebral ischemia (infarction from obstruction of local blood supply), intracranial hemorrhage, including intracerebral (intraparenchymal) hemorrhage, subarachnoid hemorrhage and ruptured berry aneurysms, and vascular malformations, hypertensive cerebrovascular disease, including lacunar infarcts, slit hemorrhages, and hypertensive encephalopathy; infections, such as acute meningitis, including acute pyogenic (bacterial) meningitis and acute aseptic (viral) meningitis, acute focal suppurative infections, including brain abscess, subdural empyema, and extradural abscess, chronic bacterial meningoencephalitis, including tuberculosis and mycobacterioses, neurosyphilis, and neuroborreliosis (Lyme disease), viral meningoencephalitis, including arthropod-borne (Arbo) viral encephalitis, Herpes simplex virus Type 1, Herpes simplex virus Type 2, Varicella-zoster virus (Herpes zoster), cytomegalovirus, poliomyelitis, rabies, and human immunodeficiency virus 1, including HIV-1 meningoencephalitis (subacute encephalitis), vacuolar myelopathy, AIDS-associated myopathy, peripheral neuropathy, and AIDS in children, progressive multifocal leukoencephalopathy, subacute sclerosing panencephalitis, fungal meningoencephalitis, other infectious diseases of the nervous system; transmissible spongiform encephalopathies (prion diseases); demyelinating diseases, including multiple sclerosis, multiple sclerosis variants, acute disseminated encephalomyelitis and acute necrotizing hemorrhagic encephalomyelitis, and other diseases with demyelination; degenerative diseases, such as degenerative diseases affecting the cerebral cortex, including Alzheimer disease and Pick disease, degenerative diseases of basal ganglia and brain stem, including Parkinsonism, idiopathic Parkinson disease (paralysis agitans), progressive supranuclear palsy, corticobasal degeneration, multiple system atrophy, including striatonigral degeneration, Shy-Drager syndrome, and olivopontocerebellar atrophy, and Huntington disease; spinocerebellar degenerations, including spinocerebellar ataxias, including Friedreich ataxia, and ataxia-telangiectasia, degenerative diseases affecting motor neurons, including amyotrophic lateral sclerosis (motor neuron disease), bulbospinal atrophy (Kennedy syndrome), and spinal muscular atrophy; inborn errors of metabolism, such as leukodystrophies, including Krabbe disease, metachromatic leukodystrophy, adrenoleukodystrophy, Pelizaeus-Merzbacher disease, and Canavan disease, mitochondrial encephalomyopathies, including Leigh disease and other mitochondrial encephalomyopathies; toxic and acquired metabolic diseases, including vitamin deficiencies such as thiamine (vitamin B1) deficiency and vitamin B12 deficiency, neurologic sequelae of metabolic disturbances, including hypoglycemia, hyperglycemia, and hepatic encephalopathy, toxic disorders, including carbon monoxide, methanol, ethanol, and radiation, including combined methotrexate and radiation-induced injury; tumors, such as gliomas, including astrocytoma, including fibrillary (diffuse) astrocytoma and glioblastoma multiforme, pilocytic astrocytoma, pleomorphic xanthoastrocytoma, and brain stem glioma, oligodendroglioma, and ependymoma and related paraventricular mass lesions, neuronal tumors, poorly differentiated neoplasms, including medulloblastoma, other parenchymal tumors, including primary brain lymphoma, germ cell tumors, and pineal parenchymal tumors, meningiomas, metastatic tumors, paraneoplastic syndromes, peripheral nerve sheath tumors, including schwannoma, neurofibroma, and malignant peripheral nerve sheath tumor (malignant schwannoma), and neurocutaneous syndromes (phakomatoses), including neurofibromatosis, including Type 1 neurofibromatosis (NF1) and Type 2 neurofibromatosis (NF2), tuberous sclerosis, and Von Hippel-Lindau disease.

[2289] Disorders of the liver include, but are not limited to, disorders associated with an accumulation in the liver of fibrous tissue, such as that resulting from an imbalance between production and degradation of the extracellular matrix accompanied by the collapse and condensation of preexisting fibers. Additional liver disorders include hepatocellular necrosis or injury induced by a wide variety of agents including processes which disturb homeostasis, such as an inflammatory process, tissue damage resulting from toxic injury or altered hepatic blood flow, and infections (e.g., bacterial, viral and parasitic). For example, the methods can be used for the early detection of hepatic injury, such as portal hypertension or hepatic fibrosis. In addition, the methods can be employed to detect liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, A1-antitrypsin deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease), disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome). Additionally, the methods described herein may be useful for the early detection and treatment of liver injury associated with the administration of various chemicals or drugs, such as, for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol, or which represents a hepatic manifestation of a vascular disorder such as obstruction of either the intrahepatic or extrahepatic bile flow or an alteration in hepatic circulation resulting, for example, from chronic heart failure, veno-occlusive disease, portal vein thrombosis or Budd-Chiari syndrome.

[2290] 32225 may have a critical role in removing xenobiotic epoxides and other toxins from the body. Furthermore, it may contribute to the metabolism of drugs and other pharmaceuticals. The study of polymorphisms in the 32225 gene should provide a useful resource for pharmacogenomic (see below) analysis of drug responses. Additionally, variations in 32225 may contribute to population differences in sensitivity to environmental toxins.

[2291] Additionally, 32225 may play an important role in the regulation of metabolism. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes. For example, many α/β hydrolase family members have lipase activity and cholesterol esterase activity. The human hormone sensitive lipase is responsible for metabolizing fat stored in adipocytes.

[2292] As used herein, disorders involving the heart, or “cardiovascular disease” or a “cardiovascular disorder” includes a disease or disorder which affects the cardiovascular system, e.g., the heart, the blood vessels, and/or the blood. A cardiovascular disorder can be caused by an imbalance in arterial pressure, a malfunction of the heart, or an occlusion of a blood vessel, e.g., by a thrombus. A cardiovascular disorder includes, but is not limited to, disorders such as arteriosclerosis, atherosclerosis, cardiac hypertrophy, ischemia reperfusion injury, restenosis, arterial inflammation, vascular wall remodeling, ventricular remodeling, rapid ventricular pacing, coronary microembolism, tachycardia, bradycardia, pressure overload, aortic bending, coronary artery ligation, vascular heart disease, valvular disease, including but not limited to, valvular degeneration caused by calcification, rheumatic heart disease, endocarditis, or complications of artificial valves; atrial fibrillation, long-QT syndrome, congestive heart failure, sinus node dysfunction, angina, heart failure, hypertension, atrial fibrillation, atrial flutter, pericardial disease, including but not limited to, pericardial effusion and pericarditis; cardiomyopathies, e.g., dilated cardiomyopathy or idiopathic cardiomyopathy, myocardial infarction, coronary artery disease, coronary artery spasm, ischemic disease, arrhythmia, sudden cardiac death, and cardiovascular developmental disorders (e.g., arteriovenous malformations, arteriovenous fistulae, Raynaud's syndrome, neurogenic thoracic outlet syndrome, causalgia/reflex sympathetic dystrophy, hemangioma, aneurysm, cavernous angioma, aortic valve stenosis, atrial septal defects, atrioventricular canal, coarctation of the aorta, Ebstein's anomaly, hypoplastic left heart syndrome, interruption of the aortic arch, mitral valve prolapse, ductus arteriosus, patent foramen ovale, partial anomalous pulmonary venous return, pulmonary atresia with ventricular septal defect, pulmonary atresia without ventricular septal defect, persistence of the fetal circulation, pulmonary valve stenosis, single ventricle, total anomalous pulmonary venous return, transposition of the great vessels, tricuspid atresia, truncus arteriosus, ventricular septal defects). A cardiovascular disease or disorder also can include an endothelial cell disorder.

[2293] As used herein, an “endothelial cell disorder” includes a disorder characterized by aberrant, unregulated, or unwanted endothelial cell activity, e.g., proliferation, migration, angiogenesis, or vascularization; or aberrant expression of cell surface adhesion molecules or genes associated with angiogenesis, e.g., TIE-2, FLT and FLK. Endothelial cell disorders include tumorigenesis, tumor metastasis, psoriasis, diabetic retinopathy, endometriosis, Graves' disease, ischemic disease (e.g., atherosclerosis), and chronic inflammatory diseases (e.g., rheumatoid arthritis).

[2294] The 32225 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO:34 thereof are collectively referred to as “polypeptides or proteins of the invention” or “32225 polypeptides or proteins”. Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “32225 nucleic acids.” 32225 molecules refer to 32225 nucleic acids, polypeptides, and antibodies.

[2295] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA), RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA. A DNA or RNA analog can be synthesized from nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[2296] The term “isolated nucleic acid molecule” or “purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules present in the natural source of the nucleic acid. For example, with regard to genomic DNA, the term “isolated” includes nucleic acid molecules which are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[2297] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and nonaqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified.
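
For reference, the four sets of conditions recited above can be collected in a lookup table. The following Python sketch is illustrative only: the dictionary layout, key names, and helper function are hypothetical, while the values are transcribed from the preceding paragraph, including the rule that very high stringency applies unless otherwise specified.

```python
# Hybridization and wash conditions as recited in the text above.
# The structure and key names are illustrative assumptions.
HYBRIDIZATION_CONDITIONS = {
    "low": {
        "hybridization": "6x SSC at about 45 C",
        "washes": "two washes in 0.2x SSC, 0.1% SDS",
        "wash_temp_c": 50,  # at least 50 C; can be raised to 55 C
    },
    "medium": {
        "hybridization": "6x SSC at about 45 C",
        "washes": "one or more washes in 0.2x SSC, 0.1% SDS",
        "wash_temp_c": 60,
    },
    "high": {
        "hybridization": "6x SSC at about 45 C",
        "washes": "one or more washes in 0.2x SSC, 0.1% SDS",
        "wash_temp_c": 65,
    },
    "very high": {
        "hybridization": "0.5 M sodium phosphate, 7% SDS at 65 C",
        "washes": "one or more washes in 0.2x SSC, 1% SDS",
        "wash_temp_c": 65,
    },
}

DEFAULT_STRINGENCY = "very high"  # used unless otherwise specified


def conditions_for(stringency=None):
    """Look up a condition set, defaulting to very high stringency."""
    return HYBRIDIZATION_CONDITIONS[stringency or DEFAULT_STRINGENCY]


print(conditions_for()["hybridization"])
```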

[2298] Preferably, an isolated nucleic acid molecule of the invention that hybridizes under a stringency condition described herein to the sequence of SEQ ID NO:33 or SEQ ID NO:35, corresponds to a naturally-occurring nucleic acid molecule.

[2299] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature. For example a naturally occurring nucleic acid molecule can encode a natural protein. As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules which include at least an open reading frame encoding a 32225 protein. The gene can optionally further include non-coding sequences, e.g., regulatory sequences and introns. Preferably, a gene encodes a mammalian 32225 protein or derivative thereof.

[2300] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. “Substantially free” means that a preparation of 32225 protein is at least 10% pure. In a preferred embodiment, the preparation of 32225 protein has less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-32225 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-32225 chemicals. When the 32225 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[2301] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 32225 without abolishing or substantially altering a 32225 activity. Preferably the alteration does not substantially alter the 32225 activity, e.g., the activity is at least 20%, 40%, 60%, 70% or 80% of wild-type. An “essential” amino acid residue is a residue that, when altered from the wild-type sequence of 32225, results in abolishing a 32225 activity such that less than 20% of the wild-type activity is present. For example, conserved amino acid residues in 32225 are predicted to be particularly unamenable to alteration.
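
The 20% activity boundary between essential and non-essential residues can be expressed as a short classification rule. The following Python sketch is illustrative only: the function name and its inputs are hypothetical, the activity values are assumed to come from an assay performed elsewhere, and the default threshold uses the lowest of the percentages listed above.

```python
def classify_residue(mutant_activity, wild_type_activity, threshold=0.20):
    """Classify an altered residue by the fraction of wild-type activity retained.

    Below the threshold the residue is treated as essential; at or above it,
    as non-essential. Higher thresholds (e.g., 0.40 or 0.80) can be passed in.
    """
    if wild_type_activity <= 0:
        raise ValueError("wild-type activity must be positive")
    fraction = mutant_activity / wild_type_activity
    return "essential" if fraction < threshold else "non-essential"


# Activity units are arbitrary; only the ratio matters.
print(classify_residue(mutant_activity=10.0, wild_type_activity=100.0))
print(classify_residue(mutant_activity=80.0, wild_type_activity=100.0))
```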

[2302] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a 32225 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a 32225 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 32225 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO:33 or SEQ ID NO:35, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
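
The side chain families listed above lend themselves to a simple membership test. The following Python sketch is illustrative only: the function name is hypothetical, residues are given as one-letter codes, and a substitution is treated as conservative when both residues share at least one of the listed families.

```python
# Side chain families as listed in the text, using one-letter residue codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),                # lysine, arginine, histidine
    "acidic": set("DE"),                # aspartic acid, glutamic acid
    "uncharged polar": set("GNQSTYC"),  # glycine ... cysteine
    "nonpolar": set("AVLIPFMW"),        # alanine ... tryptophan
    "beta-branched": set("TVI"),        # threonine, valine, isoleucine
    "aromatic": set("YFWH"),            # tyrosine ... histidine
}


def is_conservative(original, replacement):
    """Return True if two different residues share at least one family."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # no substitution took place
    return any(a in family and b in family
               for family in SIDE_CHAIN_FAMILIES.values())


print(is_conservative("K", "R"))  # both basic
print(is_conservative("D", "K"))  # acidic versus basic
```

Because some residues belong to more than one family (for example, threonine is both uncharged polar and beta-branched), the test checks every family rather than assigning each residue to a single one.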

[2303] As used herein, a “biologically active portion” of a 32225 protein includes a fragment of a 32225 protein which participates in an interaction, e.g., an intramolecular or an inter-molecular interaction. An inter-molecular interaction can be a specific binding interaction or an enzymatic interaction (e.g., the interaction can be transient and a covalent bond is formed or broken). An inter-molecular interaction can be between a 32225 molecule and a non-32225 molecule or between a first 32225 molecule and a second 32225 molecule (e.g., a dimerization interaction). Biologically active portions of a 32225 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 32225 protein, e.g., the amino acid sequence shown in SEQ ID NO:34, which include fewer amino acids than the full length 32225 proteins, and exhibit at least one activity of a 32225 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 32225 protein, e.g., hydrolysis of a substrate molecule, e.g., hydroxymuconic semialdehyde, acetylcholine, 1,2-dichloroethane, or triacylglycerols. A biologically active portion of a 32225 protein can be a polypeptide which is, for example, 10, 25, 50, 100, 200 or more amino acids in length. Biologically active portions of a 32225 protein can be used as targets for developing agents which modulate a 32225 mediated activity, e.g., hydrolysis of a substrate molecule, e.g., hydroxymuconic semialdehyde, acetylcholine, 1,2-dichloroethane, or triacylglycerols.

[2304] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[2305] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, 60%, and even more preferably at least 70%, 80%, 90%, 100% of the length of the reference sequence. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”).

[2306] The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
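
As a simple illustration of this calculation, the sketch below computes percent identity from two sequences that have already been aligned, with gaps written as "-". It assumes one common convention, namely that every alignment column (including gap columns) counts toward the denominator; other conventions exist, and the function name percent_identity is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumption: gap columns count as non-identical positions, so both the
    number and the length of gaps lower the result.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences have a gap.
    columns = [
        (x, y) for x, y in zip(aligned_a, aligned_b) if not (x == gap and y == gap)
    ]
    if not columns:
        return 0.0
    identical = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * identical / len(columns)
```

For the aligned pair "ACGT-A" and "ACCTGA", four of six columns are identical, giving a value of about 66.7%.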

[2307] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch ((1970) J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used unless otherwise specified) is a BLOSUM 62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
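
The global alignment idea underlying the Needleman and Wunsch algorithm can be sketched as follows. This is an illustrative simplification and not the GAP program: it assumes a flat match/mismatch score in place of a substitution matrix and a single linear gap penalty in place of separate gap and gap extend penalties. The function name and default score values are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment by dynamic programming.

    Returns (score, aligned_a, aligned_b). Scoring values are illustrative
    assumptions, not the matrix and penalties named in the text above.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

With these illustrative scores, aligning "ACGT" with "AGT" yields the pair "ACGT" and "A-GT". Substituting a residue-pair lookup for the flat match/mismatch score would bring the sketch closer to a matrix-based comparison.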

[2308] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Meyers and W. Miller ((1989) CABIOS, 4:11-17) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[2309] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 32225 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 32225 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25:3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[2310] Particular 32225 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO:34. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:34 are termed substantially identical.

[2311] In the context of a nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:33 or 35 are termed substantially identical.

[2312] “Misexpression or aberrant expression”, as used herein, refers to a non-wildtype pattern of gene expression at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over- or under-expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of altered, e.g., increased or decreased, expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, translated amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[2313] “Subject,” as used herein, refers to human and non-human animals. The term “non-human animals” of the invention includes all vertebrates, e.g., mammals, such as non-human primates (particularly higher primates), sheep, dog, rodent (e.g., mouse or rat), guinea pig, goat, pig, cat, rabbits, cow, and non-mammals, such as chickens, amphibians, reptiles, etc. In a preferred embodiment, the subject is a human. In another embodiment, the subject is an experimental animal or animal suitable as a disease model.

[2314] A “purified preparation of cells”, as used herein, refers to an in vitro preparation of cells. In the case of cells from multicellular organisms (e.g., plants and animals), a purified preparation of cells is a subset of cells obtained from the organism, not the entire intact organism. In the case of unicellular microorganisms (e.g., cultured cells and microbial cells), it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[2315] Various aspects of the invention are described in further detail below.

[2316] Isolated Nucleic Acid Molecules of 32225

[2317] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes a 32225 polypeptide described herein, e.g., a full-length 32225 protein or a fragment thereof, e.g., a biologically active portion of 32225 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 32225 mRNA, and fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[2318] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO:33, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 32225 protein (i.e., “the coding region” of SEQ ID NO:33, as shown in SEQ ID NO:35), as well as 5′ untranslated sequences. Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:33 (e.g., SEQ ID NO:35) and, e.g., no flanking sequences which normally accompany the subject sequence. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to a fragment of the protein from about amino acid 95 to 338.

[2319] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule which is a complement of the nucleotide sequence shown in SEQ ID NO:33 or SEQ ID NO:35, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO:33 or SEQ ID NO:35, such that it can hybridize (e.g., under a stringency condition described herein) to the nucleotide sequence shown in SEQ ID NO:33 or 35, thereby forming a stable duplex.

[2320] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence which is at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO:33 or SEQ ID NO:35, or a portion, preferably of the same length, of any of these nucleotide sequences.

[2321] 32225 Nucleic Acid Fragments

[2322] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO:33 or 35. For example, such a nucleic acid molecule can include a fragment which can be used as a probe or primer or a fragment encoding a portion of a 32225 protein, e.g., an immunogenic or biologically active portion of a 32225 protein. A fragment can comprise those nucleotides of SEQ ID NO:33, which encode an α/β hydrolase domain of human 32225. The nucleotide sequence determined from the cloning of the 32225 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 32225 family members, or fragments thereof, as well as 32225 homologues, or fragments thereof, from other species.

[2323] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment which includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 120 amino acids in length. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention, e.g., BE007972, BE007963, BE008013, or SEQ ID NO:74 from WO 99/06548.

[2324] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domain, region, or functional site described herein. Thus, for example, a 32225 nucleic acid fragment can include a sequence corresponding to an α/β hydrolase domain.

[2325] 32225 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under a stringency condition described herein to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO:33 or SEQ ID NO:35, or of a naturally occurring allelic variant or mutant of SEQ ID NO:33 or SEQ ID NO:35.

[2326] In a preferred embodiment the nucleic acid is a probe which is at least 5 or 10, and less than 200, more preferably less than 100, or less than 50, base pairs in length. It should be identical, or differ by 1, or by less than 5 or 10 bases, from a sequence disclosed herein. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[2327] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid which encodes an α/β hydrolase domain, e.g., about nucleotides 315 to 1058 of SEQ ID NO:33.
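
The relationship between amino acid residue numbers and nucleotide positions can be worked through with a short calculation. The sketch below assumes, for illustration only, that the first base of the start codon lies at nucleotide 33 of SEQ ID NO:33; that value is not stated in this section and is chosen because it reproduces the range of about nucleotides 315 to 1058 for amino acid residues 95 to 342. The function names are hypothetical.

```python
def residues_to_nucleotides(first_res, last_res, cds_start):
    """Map a 1-based residue range to the 1-based nucleotide range encoding it.

    cds_start is the 1-based position of the first base of the start codon
    (an assumed value in the example below).
    """
    start = cds_start + 3 * (first_res - 1)  # first base of the first codon
    end = cds_start + 3 * last_res - 1       # third base of the last codon
    return start, end


def nucleotide_to_residue(position, cds_start):
    """Map a 1-based nucleotide position inside the coding region to its residue."""
    if position < cds_start:
        raise ValueError("position lies upstream of the coding region")
    return (position - cds_start) // 3 + 1


# Residues 95 to 342 with an assumed coding start at nucleotide 33.
domain_range = residues_to_nucleotides(95, 342, cds_start=33)
```

Under this assumption, domain_range evaluates to (315, 1058), and nucleotides 315 and 1058 map back to residues 95 and 342, respectively.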

[2328] In another embodiment a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 32225 sequence, e.g., a domain, region, site or other sequence described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical, or differ by one base from a sequence disclosed herein or from a naturally occurring variant. For example, primers suitable for amplifying all or a portion of any of the following regions are provided: an α/β hydrolase domain, from about amino acid residues 95 to 342 of SEQ ID NO:34; a fragment of an α/β hydrolase domain that includes the catalytic histidine residue, from about amino acid residues 285 to 342 of SEQ ID NO:34; or a fragment of an α/β hydrolase domain that includes a nucleophile motif, from about amino acid residues 130 to 170 of SEQ ID NO:34.

[2329] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[2330] A nucleic acid fragment encoding a “biologically active portion of a 32225 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:33 or 35, which encodes a polypeptide having a 32225 biological activity (e.g., the biological activities of the 32225 proteins are described herein), expressing the encoded portion of the 32225 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 32225 protein. For example, a nucleic acid fragment encoding a biologically active portion of 32225 includes an α/β hydrolase domain, e.g., amino acid residues about 95 to 338 of SEQ ID NO:34, or fragments of an α/β hydrolase domain, e.g., about amino acid residues 135 to 155, 285 to 305, or 315 to 330 of SEQ ID NO:34. A nucleic acid fragment encoding a biologically active portion of a 32225 polypeptide, may comprise a nucleotide sequence which is greater than 300, 350, 400, or more nucleotides in length.

[2331] In preferred embodiments, a nucleic acid includes a nucleotide sequence which is about 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300 or more nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO:33, or SEQ ID NO:35. In a preferred embodiment, a nucleic acid includes at least one contiguous nucleotide from the region of about nucleotides 1-200, 33-314, 251-450, 315-500, 400-600, 501-750, 601-850, 751-1000, 851-1058, 900-1200, 1059-1500, 1201-1700, 1400-2000, 1650-2100, 1800-2200, 1950-2305.

[2332] 32225 Nucleic Acid Variants

[2333] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:33 or SEQ ID NO:35. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid which encodes the same 32225 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence which differs, by at least 1, but less than 5, 10, 20, 50, or 100 amino acid residues, from that shown in SEQ ID NO:34. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[2334] Nucleic acids of the invention can be chosen for having codons which are preferred, or non-preferred, for a particular expression system. E.g., the nucleic acid can be one in which at least one codon, and preferably at least 10%, or 20% of the codons, has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells.
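
The proportion of altered codons can be checked with a simple comparison of the original and altered coding sequences. The following is a minimal sketch; it assumes both sequences are in frame, of equal length, and written in DNA letters, and the function names are hypothetical.

```python
def split_codons(cds):
    """Split an in-frame coding sequence into consecutive three-base codons."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def percent_codons_altered(original, variant):
    """Percent of codon positions at which the variant differs from the original."""
    a = split_codons(original.upper())
    b = split_codons(variant.upper())
    if len(a) != len(b):
        raise ValueError("sequences must contain the same number of codons")
    if not a:
        return 0.0
    changed = sum(1 for x, y in zip(a, b) if x != y)
    return 100.0 * changed / len(a)
```

For example, changing "ATGCTGAAAGGT" to "ATGCTCAAAGGC" alters two of four codons (CTG to CTC and GGT to GGC, both synonymous changes), so 50% of the codons have been altered.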

[2335] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism) or can be non-naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared in the encoded product).

[2336] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO:33 or 35, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 1%, 5%, 10% or 20% of the nucleotides in the subject nucleic acid. If necessary for this analysis the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[2337] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the nucleotide sequence shown in SEQ ID NO:34 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under a stringency condition described herein, to the nucleotide sequence shown in SEQ ID NO:34 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 32225 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 32225 gene.

[2338] Preferred variants include those that are correlated with the ability to hydrolyze a chemical bond, e.g., a chemical bond present in a substrate, e.g., hydroxymuconic semialdehyde, acetylcholine, 1,2-dichloroethane, or triacylglycerols.

[2339] Allelic variants of 32225, e.g., human 32225, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 32225 protein within a population that maintain the ability to hydrolyze a substrate molecule, e.g., hydroxymuconic semialdehyde, acetylcholine, 1,2-dichloroethane, or triacylglycerols. Functional allelic variants will typically contain only conservative substitution of one or more amino acids of SEQ ID NO:34, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally occurring amino acid sequence variants of the 32225, e.g., human 32225, protein within a population that do not have the ability to hydrolyze a chemical bond present in a substrate molecule, e.g., hydroxymuconic semialdehyde, acetylcholine, 1,2-dichloroethane, or triacylglycerols. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO:34, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[2340] Moreover, nucleic acid molecules encoding other 32225 family members and, thus, which have a nucleotide sequence which differs from the 32225 sequences of SEQ ID NO:33 or SEQ ID NO:35 are intended to be within the scope of the invention.

[2341] Antisense Nucleic Acid Molecules, Ribozymes and Modified 32225 Nucleic Acid Molecules

[2342] In another aspect, the invention features an isolated nucleic acid molecule which is antisense to 32225. An “antisense” nucleic acid can include a nucleotide sequence which is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 32225 coding strand, or to only a portion thereof (e.g., the coding region of human 32225 corresponding to SEQ ID NO:35). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 32225 (e.g., the 5′ and 3′ untranslated regions).

[2343] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 32225 mRNA, but more preferably is an oligonucleotide which is antisense to only a portion of the coding or noncoding region of 32225 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 32225 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence of interest. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
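
The design of an antisense oligonucleotide against the region surrounding the translation start site amounts to taking the reverse complement of that stretch of the sense strand. The sketch below is illustrative only. It assumes the sense sequence is written in DNA letters, that position +1 is the A of the ATG start codon with no position 0 (so that the −10 to +10 region spans 20 nucleotides), and that the caller supplies the 0-based index of that A; the function names and example sequence are hypothetical.

```python
# Translation table for complementing DNA letters.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_around_start(sense, atg_index, upstream=10, downstream=10):
    """Antisense oligonucleotide covering the -upstream to +downstream region.

    atg_index is the 0-based index of the A of the start codon (assumed
    convention: that A is position +1 and there is no position 0).
    """
    if sense[atg_index:atg_index + 3].upper() != "ATG":
        raise ValueError("no ATG at the given index")
    lo = atg_index - upstream
    hi = atg_index + downstream
    if lo < 0 or hi > len(sense):
        raise ValueError("requested region extends beyond the sequence")
    return reverse_complement(sense[lo:hi])


# Made-up sense strand, not a 32225 sequence: 10 upstream bases, then ATG...
example_sense = "GGCCGCCACC" + "ATGGCTAAGC" + "TTT"
example_oligo = antisense_around_start(example_sense, atg_index=10)
```

The resulting oligonucleotide is 20 nucleotides long, and taking its reverse complement returns the targeted stretch of the sense strand.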

[2344] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[2345] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a 32225 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies which bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

[2346] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[2347] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for a 32225-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of a 32225 cDNA disclosed herein (i.e., SEQ ID NO:33 or SEQ ID NO:35), and a sequence having a known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haseloff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a 32225-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 32225 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science 261:1411-1418.

[2348] 32225 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 32225 (e.g., the 32225 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 32225 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6:569-84; Helene, C. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) BioEssays 14:807-15. The potential sequences that can be targeted for triple helix formation can be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.

[2349] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[2350] A 32225 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For non-limiting examples of synthetic oligonucleotides with modifications see Toulmé (2001) Nature Biotech. 19:17 and Faria et al. (2001) Nature Biotech. 19:40-44. Such phosphoramidite oligonucleotides can be effective antisense agents.

[2351] For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4: 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra and Perry-O'Keefe et al. Proc. Natl. Acad. Sci. 93: 14670-675.

[2352] PNAs of 32225 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 32225 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene, (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes, (e.g., S1 nucleases (Hyrup B. et al. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[2353] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[2354] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region which is complementary to a 32225 nucleic acid of the invention, two complementary regions one having a fluorophore and one a quencher such that the molecular beacon is useful for quantitating the presence of the 32225 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.

[2355] Isolated 32225 Polypeptides

[2356] In another aspect, the invention features an isolated 32225 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-32225 antibodies. 32225 protein can be isolated from cells or tissue sources using standard protein purification techniques. 32225 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[2357] Polypeptides of the invention include those which arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when expressed in a native cell.

[2358] In a preferred embodiment, a 32225 polypeptide has one or more of the following characteristics:

[2359] (i) it has the ability to hydrolyze a chemical bond present in a substrate molecule, e.g., hydroxymuconic semialdehyde, acetylcholine, 1,2-dichloroethane, or triacylglycerols;

[2360] (ii) it has a molecular weight, e.g., a deduced molecular weight, preferably ignoring any contribution of post translational modifications, amino acid composition or other physical characteristic of SEQ ID NO:34;

[2361] (iii) it has an overall sequence similarity of at least 50%, preferably at least 60%, more preferably at least 70, 80, 90, or 95%, with a polypeptide of SEQ ID NO:34;

[2362] (iv) it can be found in brain, liver, breast, or ovary;

[2363] (v) it has an α/β hydrolase domain which is preferably about 70%, 80%, 90% or 95% identical with amino acid residues about 95 to 338 of SEQ ID NO:34;

[2364] (vi) it has an α/β hydrolase domain nucleophile motif;

[2365] (vii) it has an α/β hydrolase domain catalytic histidine residue;

[2366] (viii) it has one, two, three, four, five, preferably six predicted Protein Kinase C phosphorylation sites (PS00005);

[2367] (ix) it has one, two, three, preferably four predicted Casein Kinase II phosphorylation sites (PS00006);

[2368] (x) it has one, preferably two predicted N-myristoylation sites (PS00008); and

[2369] (xi) it has one predicted amidation site (PS00009).
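
The predicted sites in items (viii) to (xi) refer to PROSITE pattern entries. As a minimal sketch, assuming the commonly published PROSITE consensus patterns for these four entries, such sites could be located in a protein sequence as shown below. The function name scan_sites and the toy sequence are illustrative assumptions (the toy sequence is not SEQ ID NO:34), and the patterns should be checked against the current PROSITE release before use.

```python
import re

# Assumed PROSITE consensus patterns, rewritten as regular expressions.
PROSITE_PATTERNS = {
    "PS00005": r"[ST].[RK]",                    # Protein Kinase C phosphorylation site
    "PS00006": r"[ST]..[DE]",                   # Casein Kinase II phosphorylation site
    "PS00008": r"G[^EDRKHPFYW]..[STAGCN][^P]",  # N-myristoylation site
    "PS00009": r".G[RK][RK]",                   # amidation site
}


def scan_sites(sequence):
    """Return {accession: [(1-based start, matched residues), ...]}."""
    hits = {}
    for accession, pattern in PROSITE_PATTERNS.items():
        # A lookahead is used so that overlapping sites are all reported.
        regex = re.compile("(?=(" + pattern + "))")
        hits[accession] = [
            (match.start() + 1, match.group(1))
            for match in regex.finditer(sequence)
        ]
    return hits


if __name__ == "__main__":
    toy_sequence = "MSAKTLEDGAIISAAGRRW"  # hypothetical sequence for illustration
    for accession, sites in scan_sites(toy_sequence).items():
        print(accession, sites)
```

Pattern matches of this kind are short and occur frequently by chance, so a hit indicates only a predicted site, not a demonstrated modification.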

[2370] In a preferred embodiment the 32225 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:34. In one embodiment it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another it differs from the corresponding sequence in SEQ ID NO:34 by at least one residue but less than 20%, 15%, 10% or 5% of the residues in it differ from the corresponding sequence in SEQ ID NO:34. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non essential residue or a conservative substitution. In a preferred embodiment the differences are not in the α/β hydrolase domain, e.g., located at about amino acid residues 95 to 342 of SEQ ID NO:34. In another preferred embodiment one or more differences are in the α/β hydrolase domain, e.g., located at about amino acid residues 95 to 342 of SEQ ID NO:34.
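
The counting rule in the preceding paragraph, in which mismatches and "looped" out residues both count as differences, can be sketched as follows. This is a minimal illustration that assumes the two sequences have already been aligned for maximum homology by some other tool and that gaps are written as "-". The function names, the toy sequences, and the choice of the variant's residue count as the denominator are assumptions, not part of the disclosure.

```python
def count_differences(aligned_reference, aligned_variant):
    """Count aligned columns that differ; gap columns count as differences."""
    if len(aligned_reference) != len(aligned_variant):
        raise ValueError("aligned sequences must have the same length")
    return sum(
        1 for ref, var in zip(aligned_reference, aligned_variant) if ref != var
    )


def percent_difference(aligned_reference, aligned_variant):
    """Differences as a percentage of the residues in the variant (assumed)."""
    variant_length = sum(1 for residue in aligned_variant if residue != "-")
    if variant_length == 0:
        raise ValueError("variant contains no residues")
    differences = count_differences(aligned_reference, aligned_variant)
    return 100.0 * differences / variant_length


if __name__ == "__main__":
    reference = "MKT-LLA"  # hypothetical aligned reference
    variant = "MKTALIA"    # hypothetical aligned variant
    print(count_differences(reference, variant))
    print(round(percent_difference(reference, variant), 1))
```

In this toy case the variant carries one inserted residue and one substitution, so it differs at two positions out of its seven residues.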

[2371] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 32225 proteins differ in amino acid sequence from SEQ ID NO:34, yet retain biological activity.

[2372] In one embodiment, the protein includes an amino acid sequence at least about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more homologous to SEQ ID NO:34.

[2373] A 32225 protein or fragment is provided which varies from the sequence of SEQ ID NO:34 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment. A 32225 protein or fragment is provided which varies from the sequence of SEQ ID NO:34 e.g., in regions defined by amino acids about 1 to 140 and 151 to 280 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment, but which does not differ from SEQ ID NO:34 in regions defined by amino acids about 141 to 150 and 281 to 342. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.

[2374] In one embodiment, a biologically active portion of a 32225 protein includes an α/β hydrolase domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 32225 protein.

[2375] In a preferred embodiment, the 32225 protein has an amino acid sequence shown in SEQ ID NO:34. In other embodiments, the 32225 protein is substantially identical to SEQ ID NO:34. In yet another embodiment, the 32225 protein is substantially identical to SEQ ID NO:34 and retains the functional activity of the protein of SEQ ID NO:34, as described in detail in the subsections above.

[2376] 32225 Chimeric or Fusion Proteins

[2377] In another aspect, the invention provides 32225 chimeric or fusion proteins. As used herein, a 32225 “chimeric protein” or “fusion protein” includes a 32225 polypeptide linked to a non-32225 polypeptide. A “non-32225 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the 32225 protein, e.g., a protein which is different from the 32225 protein and which is derived from the same or a different organism. The 32225 polypeptide of the fusion protein can correspond to all or a portion (e.g., a fragment described herein) of a 32225 amino acid sequence. In a preferred embodiment, a 32225 fusion protein includes at least one (or two) biologically active portions of a 32225 protein. The non-32225 polypeptide can be fused to the N-terminus or C-terminus of the 32225 polypeptide.

[2378] The fusion protein can include a moiety which has a high affinity for a ligand. For example, the fusion protein can be a GST-32225 fusion protein in which the 32225 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 32225. Alternatively, the fusion protein can be a 32225 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 32225 can be increased through use of a heterologous signal sequence.

[2379] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[2380] The 32225 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 32225 fusion proteins can be used to affect the bioavailability of a 32225 substrate. 32225 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example, (i) aberrant modification or mutation of a gene encoding a 32225 protein; (ii) mis-regulation of the 32225 gene; and (iii) aberrant post-translational modification of a 32225 protein.

[2381] Moreover, the 32225-fusion proteins of the invention can be used as immunogens to produce anti-32225 antibodies in a subject, to purify 32225 ligands and in screening assays to identify molecules which inhibit the interaction of 32225 with a 32225 substrate.

[2382] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 32225-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 32225 protein.
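
The in-frame requirement described above can be illustrated with a simple reading-frame check. The sketch below assumes the fusion moiety coding sequence (without its stop codon), any linker or cloning-site nucleotides, and the 32225 coding sequence are available as plain DNA strings. The function name and the example sequences are hypothetical and do not correspond to any particular commercial vector.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_in_frame(fusion_cds, linker, insert_cds):
    """Return True if the insert continues the reading frame of the fusion moiety.

    The upstream region (fusion moiety plus linker) must be a whole number of
    codons and must not contain a stop codon in frame; the insert must also be
    a whole number of codons.
    """
    upstream = (fusion_cds + linker).upper()
    if len(upstream) % 3 != 0 or len(insert_cds) % 3 != 0:
        return False
    codons = [upstream[i:i + 3] for i in range(0, len(upstream), 3)]
    return not any(codon in STOP_CODONS for codon in codons)


if __name__ == "__main__":
    # Hypothetical sequences for illustration only.
    print(is_in_frame("ATGTCC", "GGATCC", "ATGAAATAA"))  # six-nucleotide linker
    print(is_in_frame("ATGTCC", "GGATC", "ATGAAATAA"))   # five-nucleotide linker
```

A check of this kind only confirms the frame; it says nothing about whether the resulting fusion protein folds or retains activity.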

[2383] Variants of 32225 Proteins

[2384] In another aspect, the invention also features a variant of a 32225 polypeptide, e.g., which functions as an agonist (mimetics) or as an antagonist. Variants of the 32225 proteins can be generated by mutagenesis, e.g., discrete point mutation, the insertion or deletion of sequences or the truncation of a 32225 protein. An agonist of the 32225 proteins can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of a 32225 protein. An antagonist of a 32225 protein can inhibit one or more of the activities of the naturally occurring form of the 32225 protein by, for example, competitively modulating a 32225-mediated activity of a 32225 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 32225 protein.

[2385] Variants of a 32225 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a 32225 protein for agonist or antagonist activity.

[2386] Libraries of fragments, e.g., N-terminal, C-terminal, or internal fragments, of a 32225 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of a 32225 protein. Variants in which a cysteine residue is added or deleted or in which a residue which is glycosylated is added or deleted are particularly preferred.

[2387] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property are known in the art. Such methods are adaptable for rapid screening of the gene libraries generated by combinatorial mutagenesis of 32225 proteins. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify 32225 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6:327-331).

[2388] Cell based assays can be exploited to analyze a variegated 32225 library. For example, a library of expression vectors can be transfected into a cell line which ordinarily responds to 32225 in a substrate-dependent manner. The transfected cells are then contacted with 32225 and the effect of the expression of the mutant on signaling by the 32225 substrate can be detected, e.g., by measuring the hydrolysis of a substrate molecule, e.g., hydroxymuconic semialdehyde, acetylcholine, 1,2-dichloroethane, or triacylglycerols. Plasmid DNA can then be recovered from the cells which score for inhibition, or alternatively, potentiation of signaling by the 32225 substrate, and the individual clones further characterized.

[2389] In another aspect, the invention features a method of making a 32225 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 32225 polypeptide. The method includes: altering the sequence of a 32225 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[2390] In another aspect, the invention features a method of making a fragment or analog of a 32225 polypeptide having a biological activity of a naturally occurring 32225 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of a 32225 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[2391] Anti-32225 Antibodies

[2392] In another aspect, the invention provides an anti-32225 antibody, or a fragment thereof (e.g., an antigen-binding fragment thereof). The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. As used herein, the term “antibody” refers to a protein comprising at least one, and preferably two, heavy (H) chain variable regions (abbreviated herein as VH), and at least one and preferably two light (L) chain variable regions (abbreviated herein as VL). The VH and VL regions can be further subdivided into regions of hypervariability, termed “complementarity determining regions” (“CDR”), interspersed with regions that are more conserved, termed “framework regions” (FR). The extent of the framework region and CDR's has been precisely defined (see, Kabat, E. A., et al. (1991) Sequences of Proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, and Chothia, C. et al. (1987) J. Mol. Biol. 196:901-917, which are incorporated herein by reference). Each VH and VL is composed of three CDR's and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.

[2393] The anti-32225 antibody can further include a heavy and light chain constant region, to thereby form a heavy and light immunoglobulin chain, respectively. In one embodiment, the antibody is a tetramer of two heavy immunoglobulin chains and two light immunoglobulin chains, wherein the heavy and light immunoglobulin chains are inter-connected by, e.g., disulfide bonds. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. The light chain constant region is comprised of one domain, CL. The variable region of the heavy and light chains contains a binding domain that interacts with an antigen. The constant regions of the antibodies typically mediate the binding of the antibody to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system.

[2394] As used herein, the term “immunoglobulin” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes. The recognized human immunoglobulin genes include the kappa, lambda, alpha (IgA1 and IgA2), gamma (IgG1, IgG2, IgG3, IgG4), delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Full-length immunoglobulin “light chains” (about 25 kDa or 214 amino acids) are encoded by a variable region gene at the NH2-terminus (about 110 amino acids) and a kappa or lambda constant region gene at the COOH-terminus. Full-length immunoglobulin “heavy chains” (about 50 kDa or 446 amino acids) are similarly encoded by a variable region gene (about 116 amino acids) and one of the other aforementioned constant region genes, e.g., gamma (encoding about 330 amino acids).

[2395] The term “antigen-binding fragment” of an antibody (or simply “antibody portion,” or “fragment”), as used herein, refers to one or more fragments of a full-length antibody that retain the ability to specifically bind to the antigen, e.g., 32225 polypeptide or fragment thereof. Examples of antigen-binding fragments of the anti-32225 antibody include, but are not limited to: (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab′)₂ fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 341:544-546), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) Science 242:423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879-5883). Such single chain antibodies are also encompassed within the term “antigen-binding fragment” of an antibody. These antibody fragments are obtained using conventional techniques known to those with skill in the art, and the fragments are screened for utility in the same manner as are intact antibodies.

[2396] The anti-32225 antibody can be a polyclonal or a monoclonal antibody. In other embodiments, the antibody can be recombinantly produced, e.g., produced by phage display or by combinatorial methods.

[2397] Phage display and combinatorial methods for generating anti-32225 antibodies are known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. International Publication No. WO 92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. International Publication No. WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and Barbas et al. (1991) PNAS 88:7978-7982, the contents of all of which are incorporated by reference herein).

[2398] In one embodiment, the anti-32225 antibody is a fully human antibody (e.g., an antibody made in a mouse which has been genetically engineered to produce an antibody from a human immunoglobulin sequence), or a non-human antibody, e.g., a rodent (mouse or rat), goat, primate (e.g., monkey), or camel antibody. Preferably, the non-human antibody is a rodent (mouse or rat) antibody. Methods of producing rodent antibodies are known in the art.

[2399] Human monoclonal antibodies can be generated using transgenic mice carrying the human immunoglobulin genes rather than the mouse system. Splenocytes from these transgenic mice immunized with the antigen of interest are used to produce hybridomas that secrete human mAbs with specific affinities for epitopes from a human protein (see, e.g., Wood et al. International Application WO 91/00906; Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International Application WO 92/03918; Kay et al. International Application WO 92/03917; Lonberg, N. et al. 1994 Nature 368:856-859; Green, L. L. et al. 1994 Nature Genet. 7:13-21; Morrison, S. L. et al. 1984 Proc. Natl. Acad. Sci. USA 81:6851-6855; Bruggeman et al. 1993 Year Immunol 7:33-40; Tuaillon et al. 1993 PNAS 90:3720-3724; Bruggeman et al. 1991 Eur J Immunol 21:1323-1326).

[2400] An anti-32225 antibody can be one in which the variable region, or a portion thereof, e.g., the CDR's, are generated in a non-human organism, e.g., a rat or mouse. Chimeric, CDR-grafted, and humanized antibodies are within the invention. Antibodies generated in a non-human organism, e.g., a rat or mouse, and then modified, e.g., in the variable framework or constant region, to decrease antigenicity in a human are within the invention.

[2401] Chimeric antibodies can be produced by recombinant DNA techniques known in the art. For example, a gene encoding the Fc constant region of a murine (or other species) monoclonal antibody molecule is digested with restriction enzymes to remove the region encoding the murine Fc, and the equivalent portion of a gene encoding a human Fc constant region is substituted (see Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., International Application WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al., European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al. (1987) PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al. (1985) Nature 314:446-449; and Shaw et al., 1988, J. Natl Cancer Inst. 80:1553-1559).

[2402] A humanized or CDR-grafted antibody will have at least one or two but generally all three recipient CDR's (of heavy and/or light immunoglobulin chains) replaced with a donor CDR. The antibody may be replaced with at least a portion of a non-human CDR or only some of the CDR's may be replaced with non-human CDR's. It is only necessary to replace the number of CDR's required for binding of the humanized antibody to a 32225 or a fragment thereof. Preferably, the donor will be a rodent antibody, e.g., a rat or mouse antibody, and the recipient will be a human framework or a human consensus framework. Typically, the immunoglobulin providing the CDR's is called the “donor” and the immunoglobulin providing the framework is called the “acceptor.” In one embodiment, the donor immunoglobulin is non-human (e.g., rodent). The acceptor framework is a naturally-occurring (e.g., a human) framework or a consensus framework, or a sequence about 85% or higher, preferably 90%, 95%, 99% or higher identical thereto. As used herein, the term “consensus sequence” refers to the sequence formed from the most frequently occurring amino acids (or nucleotides) in a family of related sequences (See e.g., Winnacker, From Genes to Clones (Verlagsgesellschaft, Weinheim, Germany 1987)). In a family of proteins, each position in the consensus sequence is occupied by the amino acid occurring most frequently at that position in the family. If two amino acids occur equally frequently, either can be included in the consensus sequence. A “consensus framework” refers to the framework region in the consensus immunoglobulin sequence.
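
The definition of a consensus sequence given above can be sketched in a few lines. The example assumes the family members are already aligned to equal length. The function name, the toy sequences, and the alphabetical tie-break are illustrative assumptions; under the definition above, any of the tied residues would be equally acceptable.

```python
from collections import Counter


def consensus_sequence(aligned_sequences):
    """Return the most frequent residue at each aligned position."""
    if not aligned_sequences:
        raise ValueError("at least one sequence is required")
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be aligned to the same length")

    consensus = []
    for column in zip(*aligned_sequences):
        counts = Counter(column)
        highest = max(counts.values())
        # Residues tied for the highest count; either may be used, so the
        # alphabetically first one is chosen here purely for reproducibility.
        tied = sorted(res for res, count in counts.items() if count == highest)
        consensus.append(tied[0])
    return "".join(consensus)


if __name__ == "__main__":
    family = ["QVQL", "EVQL", "QVKL"]  # hypothetical aligned framework fragments
    print(consensus_sequence(family))
```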

[2403] An antibody can be humanized by methods known in the art. Humanized antibodies can be generated by replacing sequences of the Fv variable region which are not directly involved in antigen binding with equivalent sequences from human Fv variable regions. General methods for generating humanized antibodies are provided by Morrison, S. L., 1985, Science 229:1202-1207, by Oi et al., 1986, BioTechniques 4:214, and by Queen et al. U.S. Pat. No. 5,585,089, U.S. Pat. No. 5,693,761 and U.S. Pat. No. 5,693,762, the contents of all of which are hereby incorporated by reference. Those methods include isolating, manipulating, and expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable regions from at least one of a heavy or light chain. Sources of such nucleic acid are well known to those skilled in the art and, for example, may be obtained from a hybridoma producing an antibody against a 32225 polypeptide or fragment thereof. The recombinant DNA encoding the humanized antibody, or fragment thereof, can then be cloned into an appropriate expression vector.

[2404] Humanized or CDR-grafted antibodies can be produced by CDR-grafting or CDR substitution, wherein one, two, or all CDR's of an immunoglobulin chain can be replaced. See e.g., U.S. Pat. No. 5,225,539; Jones et al. 1986 Nature 321:522-525; Verhoeyen et al. 1988 Science 239:1534; Beidler et al. 1988 J. Immunol. 141:4053-4060; Winter U.S. Pat. No. 5,225,539, the contents of all of which are hereby expressly incorporated by reference. Winter describes a CDR-grafting method which may be used to prepare the humanized antibodies of the present invention (UK Patent Application GB 2188638A, filed on Mar. 26, 1987; Winter U.S. Pat. No. 5,225,539), the contents of which are expressly incorporated by reference.

[2405] Also within the scope of the invention are humanized antibodies in which specific amino acids have been substituted, deleted or added. Preferred humanized antibodies have amino acid substitutions in the framework region, such as to improve binding to the antigen. For example, a humanized antibody will have framework residues identical to the donor framework residue or to another amino acid other than the recipient framework residue. To generate such antibodies, a selected, small number of acceptor framework residues of the humanized immunoglobulin chain can be replaced by the corresponding donor amino acids. Preferred locations of the substitutions include amino acid residues adjacent to the CDR, or which are capable of interacting with a CDR (see e.g., U.S. Pat. No. 5,585,089). Criteria for selecting amino acids from the donor are described in U.S. Pat. No. 5,585,089, e.g., columns 12-16 of U.S. Pat. No. 5,585,089, the contents of which are hereby incorporated by reference. Other techniques for humanizing antibodies are described in Padlan et al. EP 519596 A1, published on Dec. 23, 1992.

[2406] In preferred embodiments an antibody can be made by immunizing with purified 32225 antigen, or a fragment thereof, e.g., a fragment described herein, membrane associated antigen, tissue, e.g., crude tissue preparations, whole cells, preferably living cells, lysed cells, or cell fractions, e.g., cytosolic fractions.

[2407] A full-length 32225 protein or an antigenic peptide fragment of 32225 can be used as an immunogen or can be used to identify anti-32225 antibodies made with other immunogens, e.g., cells, cytosolic fractions, and the like. The antigenic peptide of 32225 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO:34 and encompass an epitope of 32225. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.

[2408] Fragments of 32225 which include residues about 106 to 123, about 220 to 236, or about 299 to 316 can be used to make antibodies against hydrophilic regions of the 32225 protein, e.g., used as immunogens or used to characterize the specificity of an antibody. Similarly, fragments of 32225 which include residues about 71 to 88, about 135 to 157, or about 186 to 199 can be used to make an antibody against a hydrophobic region of the 32225 protein; and a fragment of 32225 which includes residues about 130 to 155, about 280 to 300, or about 310 to 342 can be used to make an antibody against the α/β hydrolase region of the 32225 protein.
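
Hydrophilic and hydrophobic regions of the kind listed above are commonly located with a sliding-window hydropathy profile. The sketch below assumes the Kyte-Doolittle hydropathy scale and a user-chosen window length; this passage does not state which scale or window was used for 32225, so both are assumptions, as are the function names and the toy sequence.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(sequence, window=9):
    """Return the mean hydropathy of every window of the given length."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=9):
    """Return (1-based start, 1-based end, mean score) of the lowest window."""
    averages = window_hydropathy(sequence, window)
    start = min(range(len(averages)), key=averages.__getitem__)
    return start + 1, start + window, averages[start]


if __name__ == "__main__":
    toy_sequence = "IIIIDDDDKKIIII"  # hypothetical sequence for illustration
    print(most_hydrophilic_window(toy_sequence, window=4))
```

The lowest-scoring windows suggest candidate hydrophilic fragments for use as immunogens; the highest-scoring windows suggest hydrophobic regions.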

[2409] Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[2410] Antibodies which bind only native 32225 protein, only denatured or otherwise non-native 32225 protein, or which bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be identified by identifying antibodies which bind to native but not denatured 32225 protein.

[2411] Preferred epitopes encompassed by the antigenic peptide are regions of 32225 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 32225 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 32225 protein and are thus likely to constitute surface residues useful for targeting antibody production.

[2412] In preferred embodiments antibodies can bind one or more of purified antigen, membrane associated antigen, tissue, e.g., tissue sections, whole cells, preferably living cells, lysed cells, cell fractions, e.g., cytosolic fractions.

[2413] The anti-32225 antibody can be a single chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher, D. et al. (1999) Ann N Y Acad Sci 880:263-80; and Reiter, Y. (1996) Clin Cancer Res 2:245-52). The single chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 32225 protein.

[2414] In a preferred embodiment the antibody has effector function and can fix complement. In other embodiments the antibody does not recruit effector cells or fix complement.

[2415] In a preferred embodiment, the antibody has reduced or no ability to bind an Fc receptor. For example, it is an isotype or subtype, fragment or other mutant, which does not support binding to an Fc receptor, e.g., it has a mutagenized or deleted Fc receptor binding region.

[2416] In a preferred embodiment, an anti-32225 antibody alters (e.g., increases or decreases) the hydrolase activity of a 32225 polypeptide. For example, the antibody can bind at or in proximity to the active site, e.g., to an epitope that includes a residue located from about 310 to 342 of SEQ ID NO:34.

[2417] The antibody can be coupled to a toxin, e.g., a polypeptide toxin, e.g., ricin or diphtheria toxin or an active fragment thereof, or a radioactive nucleus, or an imaging agent, e.g., a radioactive, enzymatic, or other imaging agent, e.g., an NMR contrast agent. Labels which produce detectable radioactive emissions or fluorescence are preferred.

[2418] An anti-32225 antibody (e.g., monoclonal antibody) can be used to isolate 32225 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-32225 antibody can be used to detect 32225 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-32225 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labelling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

[2419] The invention also includes a nucleic acid which encodes an anti-32225 antibody, e.g., an anti-32225 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g., CHO or lymphatic cells.

[2420] The invention also includes cell lines, e.g., hybridomas, which make an anti-32225 antibody, e.g., an antibody described herein, and methods of using said cells to make a 32225 antibody.

[2421] 32225 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[2422] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[2423] A vector can include a 32225 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 32225 proteins, mutant forms of 32225 proteins, fusion proteins, and the like).

[2424] The recombinant expression vectors of the invention can be designed for expression of 32225 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[2425] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[2426] Purified fusion proteins can be used in 32225 activity assays, (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 32225 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells which are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six weeks).

[2427] One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
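By way of illustration only, the codon-alteration strategy can be sketched in Python as follows. The sketch replaces each codon of a coding sequence with a synonymous codon taken from a caller-supplied preference table and confirms that the encoded amino acid sequence is unchanged. The function names, the input sequence and the three-entry preference table are hypothetical placeholders; an actual table would be taken from a codon-usage reference such as the one cited above.

```python
from itertools import product

# Standard genetic code, built in TCAG order (stop codons shown as "*").
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding sequence whose length is a multiple of three."""
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna, preferred):
    """Swap each codon for the preferred synonymous codon, when one is given.

    `preferred` maps a one-letter amino acid to a codon. Amino acids missing
    from the table keep their original codon.
    """
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        aa = CODON_TO_AA[codon]
        choice = preferred.get(aa, codon)
        if CODON_TO_AA[choice] != aa:
            raise ValueError(f"{choice} does not encode {aa}")
        out.append(choice)
    return "".join(out)


# Hypothetical, partial preference table used only for this illustration.
EXAMPLE_PREFERRED = {"L": "CTG", "R": "CGT", "P": "CCG"}

original = "ATGTTACGACCA"  # hypothetical fragment encoding M-L-R-P
optimized = recode(original, EXAMPLE_PREFERRED)
assert translate(optimized) == translate(original)
```

The final assertion reflects the requirement that the alteration change only the codons and not the encoded polypeptide.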

[2428] The 32225 expression vector can be a yeast expression vector, a vector for expression in insect cells, e.g., a baculovirus expression vector or a vector suitable for expression in mammalian cells.

[2429] When used in mammalian cells, the expression vector's control functions can be provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[2430] In another embodiment, the promoter is an inducible promoter, e.g., a promoter regulated by a steroid hormone, by a polypeptide hormone (e.g., by means of a signal transduction pathway), or by a heterologous polypeptide (e.g., the tetracycline-inducible systems, “Tet-On” and “Tet-Off”; see, e.g., Clontech Inc., CA, Gossen and Bujard (1992) Proc. Natl. Acad. Sci. USA 89:5547, and Paillard (1989) Human Gene Therapy 9:983).

[2431] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Campes and Tilghman (1989) Genes Dev. 3:537-546).

[2432] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus.

[2433] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., a 32225 nucleic acid molecule within a recombinant expression vector or a 32225 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[2434] A host cell can be any prokaryotic or eukaryotic cell. For example, a 32225 protein can be expressed in bacterial cells (such as E. coli), insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells (African green monkey kidney cells CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182)). Other suitable host cells are known to those skilled in the art.

[2435] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[2436] A host cell of the invention can be used to produce (i.e., express) a 32225 protein. Accordingly, the invention further provides methods for producing a 32225 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a 32225 protein has been introduced) in a suitable medium such that a 32225 protein is produced. In another embodiment, the method further includes isolating a 32225 protein from the medium or the host cell.

[2437] In another aspect, the invention features a cell or purified preparation of cells which include a 32225 transgene, or which otherwise misexpress 32225. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include a 32225 transgene, e.g., a heterologous form of a 32225, e.g., a gene derived from humans (in the case of a non-human cell). The 32225 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene that misexpresses an endogenous 32225, e.g., a gene the expression of which is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders that are related to mutated or misexpressed 32225 alleles or for use in drug screening.

[2438] In another aspect, the invention features a human cell, e.g., a hepatic or hematopoietic stem cell, transformed with a nucleic acid which encodes a subject 32225 polypeptide.

[2439] Also provided are cells, preferably human cells, e.g., human hepatic, hematopoietic or fibroblast cells, in which an endogenous 32225 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 32225 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 32225 gene. For example, an endogenous 32225 gene which is “transcriptionally silent,” e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques such as targeted homologous recombination can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published on May 16, 1991.

[2440] In a preferred embodiment, recombinant cells described herein can be used for replacement therapy in a subject. For example, a nucleic acid encoding a 32225 polypeptide operably linked to an inducible promoter (e.g., a steroid hormone receptor-regulated promoter) is introduced into a human or nonhuman, e.g., mammalian, e.g., porcine recombinant cell. The cell is cultivated and encapsulated in a biocompatible material, such as poly-lysine alginate, and subsequently implanted into the subject. See, e.g., Lanza (1996) Nat. Biotechnol. 14:1107; Joki et al. (2001) Nat. Biotechnol. 19:35; and U.S. Pat. No. 5,876,742. Production of 32225 polypeptide can be regulated in the subject by administering an agent (e.g., a steroid hormone) to the subject. In another preferred embodiment, the implanted recombinant cells express and secrete an antibody specific for a 32225 polypeptide. The antibody can be any antibody or any antibody derivative described herein.

[2441] 32225 Transgenic Animals

[2442] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of a 32225 protein and for identifying and/or evaluating modulators of 32225 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 32225 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[2443] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of a 32225 protein to particular cells. A transgenic founder animal can be identified based upon the presence of a 32225 transgene in its genome and/or expression of 32225 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a 32225 protein can further be bred to other transgenic animals carrying other transgenes.

[2444] 32225 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments the nucleic acid is placed under the control of a tissue specific promoter, e.g., a milk or egg specific promoter, and the protein or polypeptide is recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[2445] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[2446] Uses of 32225

[2447] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: a) screening assays; b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); and c) methods of treatment (e.g., therapeutic and prophylactic).

[2448] The isolated nucleic acid molecules of the invention can be used, for example, to express a 32225 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect a 32225 mRNA (e.g., in a biological sample) or a genetic alteration in a 32225 gene, and to modulate 32225 activity, as described further below. 32225 proteins can be used, in vitro, to hydrolyze substrate compounds as part of a synthetic process. The 32225 proteins can also be used to treat disorders characterized by insufficient or excessive production of a 32225 substrate or production of 32225 inhibitors. In addition, the 32225 proteins can be used to screen for naturally occurring 32225 substrates, to screen for drugs or compounds which modulate 32225 activity, as well as to treat disorders characterized by insufficient or excessive production of 32225 protein or production of 32225 protein forms which have decreased, aberrant or unwanted activity compared to 32225 wild type protein (e.g., cellular proliferative and/or differentiative disorders, neural disorders, liver disorders, metabolic disorders, or cardiovascular disorders). Moreover, the anti-32225 antibodies of the invention can be used to detect and isolate 32225 proteins, regulate the bioavailability of 32225 proteins, and modulate 32225 activity.

[2449] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 32225 polypeptide is provided. The method includes: contacting the compound with the subject 32225 polypeptide; and evaluating ability of the compound to interact with, e.g., to bind or form a complex with the subject 32225 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with subject 32225 polypeptide. It can also be used to find natural or synthetic inhibitors of subject 32225 polypeptide. Screening methods are discussed in more detail below.

[2450] 32225 Screening Assays

[2451] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to 32225 proteins, have a stimulatory or inhibitory effect on, for example, 32225 expression or 32225 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 32225 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 32225 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[2452] In one embodiment, the invention provides assays for screening candidate or test compounds which are substrates of a 32225 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate an activity of a 32225 protein or polypeptide or a biologically active portion thereof.

[2453] In one embodiment, an activity of a 32225 protein can be assayed in vitro. First, 32225 protein can be expressed in a bacterial cell and then purified, e.g., by means of a covalently attached affinity tag, e.g., a His-6 tag. Purified 32225 can then be incubated in buffer, e.g., 100 mM Tris-HCl, pH 8.5, along with a substrate molecule, e.g., hydroxymuconic semialdehyde, acetylcholine, 1,2-dichloroethane, or triacylglycerols. By measuring the optical density of the solution at various time points, the loss of substrate or increase in product can be monitored, which is a direct measure of 32225 activity. The appropriate wavelength used to measure optical density will depend upon the light absorption spectra of the substrate and product molecules. An example of such an assay, used to monitor the activity of a 2-Hydroxymuconic Semialdehyde Dehydrogenase enzyme, is provided in Inoue et al. (1995), J. Bacteriol. 177(5):1196-1201, the contents of which are incorporated herein by reference.
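As a non-limiting sketch of how such optical density readings could be reduced to an activity value, the following Python fragment fits a straight line to absorbance readings taken during the initial, approximately linear phase of the reaction and converts the slope to a concentration change per second using the Beer-Lambert relation (absorbance = extinction coefficient × path length × concentration). The function name, the time points, the absorbance values and the extinction coefficient are hypothetical and serve only to illustrate the calculation.

```python
def initial_rate(times_s, absorbances, epsilon_m_cm, path_cm=1.0):
    """Return (slope, rate) from an absorbance time course.

    slope: least-squares change in absorbance per second.
    rate:  change in concentration (mol/L per second), obtained by dividing
           the slope by (extinction coefficient x path length).
    A negative value indicates loss of the absorbing species.
    """
    n = len(times_s)
    if n < 2 or n != len(absorbances):
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_s) / n
    mean_a = sum(absorbances) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_s, absorbances))
    slope = sxy / sxx
    return slope, slope / (epsilon_m_cm * path_cm)


# Hypothetical readings: substrate absorbance falling over 90 seconds.
times = [0, 30, 60, 90]
readings = [0.80, 0.74, 0.68, 0.62]
# Hypothetical extinction coefficient in M^-1 cm^-1 at the chosen wavelength.
slope, rate = initial_rate(times, readings, epsilon_m_cm=10000.0)
```

The extinction coefficient and wavelength would in practice be chosen for the particular substrate or product being monitored, as noted above.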

[2454] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann, R. N. et al. (1994) J. Med. Chem. 37:2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).

[2455] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. U.S.A. 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and Gallop et al. (1994) J. Med. Chem. 37:1233.

[2456] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner, U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. 5,223,409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382; Felici (1991) J. Mol. Biol. 222:301-310; Ladner supra.).

[2457] In one embodiment, an assay is a cell-based assay in which a cell which expresses a 32225 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 32225 activity is determined. Determining the ability of the test compound to modulate 32225 activity can be accomplished by monitoring, for example, the hydrolysis of a substrate molecule, e.g., hydroxymuconic semialdehyde, acetylcholine, 1,2-dichloroethane, or triacylglycerols. The cell, for example, can be of mammalian origin, e.g., human.

[2458] The ability of the test compound to modulate 32225 binding to a compound, e.g., a 32225 substrate, or to bind to 32225 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 32225 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 32225 could be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 32225 binding to a 32225 substrate in a complex. For example, compounds (e.g., 32225 substrates) can be labeled with ¹²⁵I, ³⁵S, ¹⁴C, or ³H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[2459] The ability of a compound (e.g., a 32225 substrate) to interact with 32225 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 32225 without the labeling of either the compound or the 32225. McConnell, H. M. et al. (1992) Science 257:1906-1912. As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 32225.

[2460] In yet another embodiment, a cell-free assay is provided in which a 32225 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 32225 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 32225 proteins to be used in assays of the present invention include fragments which participate in interactions with non-32225 molecules, e.g., fragments with high surface probability scores.

[2461] Soluble and/or membrane-bound forms of isolated proteins (e.g., 32225 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, Isotridecylpoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[2462] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[2463] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternately, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
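The distance dependence referred to above is commonly described by the Förster relation, in which the transfer efficiency E equals 1/(1 + (r/R0)^6), where r is the donor-acceptor separation and R0 is the separation at which transfer is 50% efficient. The following Python sketch, offered for illustration only, evaluates that relation and also estimates the efficiency from donor emission measured with and without the acceptor. The function names and numerical values are hypothetical, and R0 must be determined for the particular label pair used.

```python
def fret_efficiency_from_distance(r, r0):
    """Forster relation: E = 1 / (1 + (r / r0)**6); r and r0 in the same units."""
    if r < 0 or r0 <= 0:
        raise ValueError("r must be non-negative and r0 positive")
    return 1.0 / (1.0 + (r / r0) ** 6)


def fret_efficiency_from_donor(f_donor_alone, f_donor_with_acceptor):
    """Estimate E from donor emission quenching: E = 1 - F_DA / F_D."""
    if f_donor_alone <= 0:
        raise ValueError("donor-alone emission must be positive")
    return 1.0 - f_donor_with_acceptor / f_donor_alone


def distance_from_efficiency(e, r0):
    """Invert the Forster relation: r = r0 * (1/E - 1)**(1/6), for 0 < E < 1."""
    if not 0.0 < e < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0 * ((1.0 / e) - 1.0) ** (1.0 / 6.0)


# Hypothetical example: donor emission drops from 100 to 25 units on binding.
e = fret_efficiency_from_donor(100.0, 25.0)
# Hypothetical R0 of 5.0 nm for the chosen donor/acceptor pair.
r = distance_from_efficiency(e, r0=5.0)
```

Because the efficiency falls off with the sixth power of the separation, a measurable signal is expected only when the labeled molecules are in close proximity, which is consistent with use of the signal as an indicator of binding.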

[2464] In another embodiment, determining the ability of the 32225 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal which can be used as an indication of real-time reactions between biological molecules.

[2465] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound, (which is not anchored), can be labeled, either directly or indirectly, with detectable labels discussed herein.

[2466] It may be desirable to immobilize either 32225, an anti-32225 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a 32225 protein, or interaction of a 32225 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/32225 fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 32225 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and complex formation is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 32225 binding or activity determined using standard techniques.

[2467] Other techniques for immobilizing either a 32225 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 32225 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical).

[2468] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[2469] In one embodiment, this assay is performed utilizing antibodies reactive with 32225 protein or target molecules but which do not interfere with binding of the 32225 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 32225 protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 32225 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 32225 protein or target molecule.

[2470] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components, by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., (1993) Trends Biochem Sci 18:284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York.); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. (1999) Current Protocols in Molecular Biology, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., (1998) J Mol Recognit 11: 141-8; Hage, D. S., and Tweed, S. A. (1997) J Chromatogr B Biomed Sci Appl. 699:499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[2471] In a preferred embodiment, the assay includes contacting the 32225 protein or biologically active portion thereof with a known compound which binds 32225 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a 32225 protein, wherein determining the ability of the test compound to interact with a 32225 protein includes determining the ability of the test compound to preferentially bind to 32225 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[2472] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 32225 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of a 32225 protein through modulation of the activity of a downstream effector of a 32225 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[2473] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), a reaction mixture containing the target gene product and the binding partner is prepared, under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.
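
The comparison between the control reaction and the reaction containing the test compound can be sketched numerically. The following Python fragment is a minimal illustration only: the function names, the background correction, and the 50% cutoff are assumptions introduced for this example and are not part of the described assay.

```python
def percent_inhibition(control_signal, test_signal, background=0.0):
    """Percent reduction in complex signal in the presence of the test compound.

    control_signal: complex signal without test compound (or with placebo)
    test_signal:    complex signal with the test compound
    background:     signal from a blank well (hypothetical correction)
    """
    control = control_signal - background
    test = test_signal - background
    if control <= 0:
        raise ValueError("control reaction shows no complex above background")
    return 100.0 * (1.0 - test / control)


def interferes(control_signal, test_signal, background=0.0, threshold=50.0):
    """Flag a compound as interfering if inhibition meets an assumed cutoff."""
    return percent_inhibition(control_signal, test_signal, background) >= threshold
```

The same calculation could be run separately for normal and mutant target gene product to look for compounds that preferentially disrupt one of the two.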

[2474] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[2475] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled, either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[2476] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[2477] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[2478] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in that either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[2479] In yet another aspect, the 32225 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 32225 (“32225-binding proteins” or “32225-bp”) and are involved in 32225 activity. Such 32225-bps can be activators or inhibitors of signals by the 32225 proteins or 32225 targets as, for example, downstream elements of a 32225-mediated signaling pathway.

[2480] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a 32225 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively, the 32225 protein can be fused to the activator domain.) If the “bait” and the “prey” proteins are able to interact, in vivo, forming a 32225-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., lacZ) which is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene which encodes the protein which interacts with the 32225 protein.

[2481] In another embodiment, modulators of 32225 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 32225 mRNA or protein evaluated relative to the level of expression of 32225 mRNA or protein in the absence of the candidate compound. When expression of 32225 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 32225 mRNA or protein expression. Alternatively, when expression of 32225 mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 32225 mRNA or protein expression. The level of 32225 mRNA or protein expression can be determined by methods described herein for detecting 32225 mRNA or protein.
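
The decision rule in the preceding paragraph can be illustrated with replicate expression measurements. The sketch below assumes an exact two-sided permutation test on the difference in means and a significance level of 0.05; both choices, and all function names, are assumptions made for illustration, since the text does not prescribe a particular statistical test.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Exact two-sided permutation p-value for a difference in group means."""
    pooled = list(treated) + list(untreated)
    n_treated = len(treated)
    observed = abs(mean(treated) - mean(untreated))
    extreme = 0
    total = 0
    # Enumerate every way of relabeling n_treated measurements as "treated".
    for idx in combinations(range(len(pooled)), n_treated):
        chosen = set(idx)
        group = [pooled[i] for i in chosen]
        rest = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        if abs(mean(group) - mean(rest)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


def classify_candidate(treated, untreated, alpha=0.05):
    """Label a candidate compound from expression with and without it."""
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"
```

Exhaustive enumeration is practical only for small replicate counts; with larger groups a sampled permutation test or a parametric test would be the more usual choice.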

[2482] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a 32225 protein can be confirmed in vivo, e.g., in an animal such as an animal model for a cellular proliferative and/or differentiative disorder, a neural disorder, a liver disorder, a metabolic disorder, or a cardiovascular disorder.

[2483] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a 32225 modulating agent, an antisense 32225 nucleic acid molecule, a 32225-specific antibody, or a 32225-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[2484] 32225 Detection Assays

[2485] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome, e.g., to locate gene regions associated with genetic disease or to associate 32225 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[2486] 32225 Chromosome Mapping

[2487] The 32225 nucleotide sequences or portions thereof can be used to map the location of the 32225 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 32225 sequences with genes associated with disease.

[2488] Briefly, 32225 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 32225 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 32225 sequences will yield an amplified fragment.

[2489] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes, and a full set of mouse chromosomes, can allow easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220:919-924).
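
The logic of reading a hybrid panel can be sketched as a concordance check: the gene is assigned to the chromosome whose presence across the hybrids matches the pattern of PCR-positive hybrids. The data layout and the function name below are hypothetical and serve only to illustrate the reasoning.

```python
def assign_chromosome(panel, pcr_positive):
    """Return human chromosomes fully concordant with the PCR results.

    panel:        dict mapping hybrid name -> set of human chromosomes retained
    pcr_positive: dict mapping hybrid name -> True if an amplified fragment
                  was obtained with the gene-specific primers
    """
    chromosomes = set().union(*panel.values())
    concordant = []
    for chrom in sorted(chromosomes):
        # A chromosome is concordant if it is present in every positive
        # hybrid and absent from every negative hybrid.
        if all((chrom in panel[h]) == pcr_positive[h] for h in panel):
            concordant.append(chrom)
    return concordant
```

A panel that returns more than one chromosome is not informative enough on its own, and an empty result suggests a scoring error or a fragmented chromosome in one of the hybrids.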

[2490] Other mapping strategies, e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries, can be used to map 32225 to a chromosomal location.

[2491] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases, will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques ((1988) Pergamon Press, New York).

[2492] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[2493] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325:783-787.

[2494] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 32225 gene, can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.
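
The filter described above (a variant seen in some or all affected individuals but in no unaffected individual) can be expressed as simple set logic. In the sketch below the variant labels, the dictionary layout, and the function name are hypothetical; the result is only a list of candidates, which would still need to be distinguished from polymorphisms by further sequencing.

```python
from collections import Counter


def candidate_variants(affected, unaffected):
    """Count carriers of each variant found only in affected individuals.

    affected / unaffected: dict mapping individual -> iterable of variant
    labels (any hashable description of a sequence difference).
    """
    seen_in_unaffected = set()
    for variants in unaffected.values():
        seen_in_unaffected |= set(variants)

    carriers = Counter()
    for variants in affected.values():
        for variant in set(variants):
            if variant not in seen_in_unaffected:
                carriers[variant] += 1
    return dict(carriers)
```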

[2495] 32225 Tissue Typing

[2496] 32225 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).

[2497] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 32225 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.

[2498] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences of SEQ ID NO:33 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO:35 are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[2499] If a panel of reagents from 32225 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.
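
A lookup against such an identification database can be sketched as follows. The profile representation (locus name mapped to the sequence obtained at that locus), the `min_loci` parameter, and the function name are assumptions made for this illustration.

```python
def match_individual(sample_profile, database, min_loci=1):
    """Return individuals whose stored profile agrees with the sample.

    sample_profile: dict mapping locus name -> sequence observed in the sample
    database:       dict mapping individual -> dict of locus name -> sequence
    min_loci:       minimum number of shared loci required for a comparison
    """
    matches = []
    for individual, reference in database.items():
        shared = [locus for locus in sample_profile if locus in reference]
        if len(shared) < min_loci:
            continue  # too little overlap to support an identification
        if all(sample_profile[locus] == reference[locus] for locus in shared):
            matches.append(individual)
    return matches
```

Requiring a larger `min_loci` makes a coincidental match less likely, in line with the observation that a sufficiently large panel is needed for positive identification.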

[2500] Use of Partial 32225 Sequences in Forensic Biology

[2501] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[2502] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e. another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO:33 (e.g., fragments derived from the noncoding regions of SEQ ID NO:33 having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[2503] The 32225 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 32225 probes can be used to identify tissue by species and/or by organ type.

[2504] In a similar fashion, these reagents, e.g., 32225 primers or probes can be used to screen tissue culture for contamination (i.e. screen for the presence of a mixture of different types of cells in a culture).

[2505] Predictive Medicine of 32225

[2506] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[2507] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene which encodes 32225.

[2508] Such disorders include, e.g., a disorder associated with the misexpression of 32225 gene; a disorder of the lung, breast, colon, ovary, brain, or liver; a disorder characterized by unwanted cell proliferation, e.g., cancer, of the lung, breast, colon, ovary, brain, or liver.

[2509] The method includes one or more of the following:

[2510] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 32225 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[2511] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 32225 gene;

[2512] detecting, in a tissue of the subject, the misexpression of the 32225 gene, at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[2513] detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of a 32225 polypeptide.

[2514] In preferred embodiments the method includes: ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 32225 gene; an insertion of one or more nucleotides into the gene, a point mutation, e.g., a substitution of one or more nucleotides of the gene, a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[2515] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from SEQ ID NO:33, or naturally occurring mutants thereof or 5′ or 3′ flanking sequences naturally associated with the 32225 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[2516] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 32225 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 32225.

[2517] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[2518] In preferred embodiments the method includes determining the structure of a 32225 gene, an abnormal structure being indicative of risk for the disorder.

[2519] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 32225 protein or a nucleic acid, which hybridizes specifically with the gene. These and other embodiments are discussed below.

[2520] Diagnostic and Prognostic Assays of 32225

[2521] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 32225 molecules and for identifying variations and mutations in the sequence of 32225 molecules.

[2522] Expression Monitoring and Profiling:

[2523] The presence, level, or absence of 32225 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 32225 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 32225 protein such that the presence of 32225 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum or lung, breast, colon, ovary, brain, or liver tissue. The level of expression of the 32225 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 32225 genes; measuring the amount of protein encoded by the 32225 genes; or measuring the activity of the protein encoded by the 32225 genes.

[2524] The level of mRNA corresponding to the 32225 gene in a cell can be determined both by in situ and by in vitro formats.

[2525] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 32225 nucleic acid, such as the nucleic acid of SEQ ID NO:33, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 32225 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[2526] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 32225 genes.

[2527] The level of mRNA in a sample that is encoded by 32225 can be evaluated with nucleic acid amplification, e.g., by rtPCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
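
The primer geometry described above (primers of about 10 to 30 nucleotides flanking a region of about 50 to 200 nucleotides) can be checked against a template sequence in a few lines. This is a sketch under simplifying assumptions: exact matching only, a single-stranded template given 5′ to 3′, and hypothetical function names; it does not model annealing temperature or mismatches.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def check_primer_pair(template, forward, reverse,
                      primer_len=(10, 30), flanked_len=(50, 200)):
    """Locate a primer pair on the plus strand and report amplicon geometry.

    Returns None if either primer has no exact binding site, otherwise a dict
    with the amplicon length, the length of the flanked region, and whether
    the lengths fall inside the stated ranges.
    """
    template, forward, reverse = template.upper(), forward.upper(), reverse.upper()
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer anneals to the plus strand, so its reverse
    # complement appears in the plus-strand sequence downstream of forward.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    flanked = end - (start + len(forward))
    return {
        "amplicon_length": end + len(site) - start,
        "flanked_length": flanked,
        "primer_lengths_ok": all(primer_len[0] <= len(p) <= primer_len[1]
                                 for p in (forward, reverse)),
        "flanked_length_ok": flanked_len[0] <= flanked <= flanked_len[1],
    }
```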

[2528] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 32225 gene being analyzed.

[2529] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 32225 mRNA, or genomic DNA, and comparing the presence of 32225 mRNA or genomic DNA in the control sample with the presence of 32225 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 32225 transcript levels.

[2530] A variety of methods can be used to determine the level of protein encoded by 32225. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody, with a sample to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)₂) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[2531] The detection methods can be used to detect 32225 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 32225 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 32225 protein include introducing into a subject a labeled anti-32225 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-32225 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[2532] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 32225 protein, and comparing the presence of 32225 protein in the control sample with the presence of 32225 protein in the test sample.

[2533] The invention also includes kits for detecting the presence of 32225 in a biological sample. For example, the kit can include a compound or agent capable of detecting 32225 protein or mRNA in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 32225 protein or nucleic acid.

[2534] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[2535] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples which can be assayed and compared to the test sample. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[2536] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with misexpressed or aberrant or unwanted 32225 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as a neurological disorder, a liver disorder, a metabolic disorder, a cardiovascular disorder, or deregulated cell proliferation.

[2537] In one embodiment, a disease or disorder associated with aberrant or unwanted 32225 expression or activity is identified. A test sample is obtained from a subject and 32225 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 32225 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 32225 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[2538] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 32225 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for a neurological disorder, a liver disorder, a metabolic disorder, a cardiovascular disorder, or a cell proliferative and/or differentiative disorder.

[2539] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 32225 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 32225 (e.g., other genes associated with a 32225-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
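
One way such data records could be laid out as a relational table is sketched below. SQLite (via Python's standard `sqlite3` module) is used here purely as a convenient stand-in for the database environments named above, and the table name, column names, and helper functions are hypothetical.

```python
import sqlite3


def build_expression_db():
    """Create an in-memory database with one table of expression records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE expression_record (
            sample_id        TEXT NOT NULL,  -- identifier of the sample
            subject          TEXT,           -- subject the sample came from
            diagnosis        TEXT,
            treatment        TEXT,
            gene             TEXT NOT NULL,  -- e.g. '32225' or another gene
            expression_level REAL NOT NULL,
            PRIMARY KEY (sample_id, gene)
        )
        """
    )
    return conn


def add_record(conn, sample_id, gene, level,
               subject=None, diagnosis=None, treatment=None):
    """Insert one expression value together with its sample descriptors."""
    conn.execute(
        "INSERT INTO expression_record VALUES (?, ?, ?, ?, ?, ?)",
        (sample_id, subject, diagnosis, treatment, gene, level),
    )


def levels_for_gene(conn, gene):
    """Return (sample_id, expression_level) pairs for one gene."""
    rows = conn.execute(
        "SELECT sample_id, expression_level FROM expression_record "
        "WHERE gene = ? ORDER BY sample_id",
        (gene,),
    )
    return rows.fetchall()
```

Storing one row per sample and gene lets the same table hold values for 32225 alongside values for other genes measured on the same array.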

[2540] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 32225 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used to diagnose a cellular proliferative disorder in a subject wherein an increase in 32225 expression is an indication that the subject has or is disposed to having a cellular proliferative disorder, e.g., lung cancer, or some forms of breast, colon, ovary, or brain cancer. Alternatively, a decrease in 32225 expression could be an indication that the subject has or is predisposed to having a cellular proliferative disorder, e.g., some form of ovary or brain cancer. The method can be used to monitor a treatment for a cellular proliferative disorder in a subject. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[2541] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 32225 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for a desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from an uncontacted cell.

[2542] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; and b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 32225 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profiles is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
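
By way of illustration only, the distance metric mentioned above (the length of the difference vector between two profiles, i.e., the Euclidean distance) and the selection of the most similar reference profile could be sketched as follows. The function names, profile names, and numerical values are hypothetical, and other statistical measures could equally be used.

```python
import math


def profile_distance(profile_a, profile_b):
    """Length of the difference vector between two equal-length profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same number of dimensions")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


def most_similar_reference(subject_profile, reference_profiles):
    """Return the name of the reference profile closest to the subject profile."""
    return min(
        reference_profiles,
        key=lambda name: profile_distance(subject_profile, reference_profiles[name]),
    )


# Made-up profiles; each position is the expression value of one gene.
references = {
    "reference_1": [1.0, 1.0, 1.0],
    "reference_2": [5.0, 0.5, 2.0],
}
subject = [4.5, 0.7, 2.2]
print(most_similar_reference(subject, references))
```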

[2543] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[2544] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile, and the reference expression profiles each include a value representing the level of 32225 expression.

[2545] 32225 Arrays and Uses Thereof

[2546] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to a 32225 molecule (e.g., a 32225 nucleic acid or a 32225 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm², and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to the addresses of the plurality can be disposed on the array.

[2547] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to a 32225 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 32225. Each address of the subset can include a capture probe that hybridizes to a different region of a 32225 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for a 32225 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 32225 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 32225 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).

[2548] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[2549] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to a 32225 polypeptide or fragment thereof. The polypeptide can be a naturally-occurring interaction partner of 32225 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-32225 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[2550] In another aspect, the invention features a method of analyzing the expression of 32225. The method includes providing an array as described above; contacting the array with a sample; and detecting binding of a 32225-molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally, the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[2551] In another embodiment, the array can be used to assay gene expression in a tissue, e.g., lung, breast, colon, ovary, brain, or liver tissue, to ascertain tissue specificity of genes in the array, particularly the expression of 32225. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 32225. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
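
By way of illustration only, one simple way to flag genes whose expression tracks that of 32225 across samples is to rank them by Pearson correlation, a similarity measure commonly used as the input to hierarchical clustering. The sketch below is not itself a clustering algorithm; the gene labels, the expression values, and the 0.9 threshold are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length value lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def co_regulated(expression, target, threshold=0.9):
    """Return genes whose correlation with the target gene meets the threshold."""
    reference = expression[target]
    return sorted(
        gene
        for gene, values in expression.items()
        if gene != target and pearson(reference, values) >= threshold
    )


# Made-up expression values across four samples (e.g., four tissues).
expression = {
    "32225": [1.0, 2.0, 3.0, 4.0],
    "GENE_A": [2.0, 4.0, 6.0, 8.5],   # rises with 32225
    "GENE_B": [4.0, 3.0, 2.0, 1.0],   # falls as 32225 rises
    "GENE_C": [1.0, 1.0, 1.0, 1.0],   # constant
}
print(co_regulated(expression, "32225"))
```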

[2552] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 32225 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[2553] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of other than the target gene can be ascertained and counteracted.

[2554] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of a 32225-associated disease or disorder, and processes, such as a cellular transformation, associated with a 32225-associated disease or disorder. The method can also evaluate the treatment and/or progression of a 32225-associated disease or disorder.

[2555] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 32225) that could serve as a molecular target for diagnosis or therapeutic intervention.

[2556] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon a 32225 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to a 32225 polypeptide or fragment thereof. For example, multiple variants of a 32225 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the addresses of the plurality can be disposed on the array.

[2557] The polypeptide array can be used to detect a 32225 binding compound, e.g., an antibody in a sample from a subject with specificity for a 32225 polypeptide or the presence of a 32225-binding protein or ligand.

[2558] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 32225 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[2559] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 32225 or from a cell or subject in which a 32225 mediated response has been elicited, e.g., by contact of the cell with 32225 nucleic acid or protein, or administration to the cell or subject of 32225 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 32225 (or does not express as highly as in the case of the 32225 positive plurality of capture probes) or from a cell or subject in which a 32225 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which are preferably other than a 32225 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[2560] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a first sample from a cell or subject which expresses or mis-expresses 32225 or from a cell or subject in which a 32225-mediated response has been elicited, e.g., by contact of the cell with 32225 nucleic acid or protein, or administration to the cell or subject of 32225 nucleic acid or protein, or a normal or diseased lung, breast, colon, ovary, brain, or liver tissue; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a second sample from a cell or subject which does not express 32225 (or does not express as highly as in the case of the 32225 positive plurality of capture probes) or from a cell or subject in which a 32225 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used, the plurality of addresses with capture probes should be present on both arrays.

[2561] In another aspect, the invention features a method of analyzing 32225, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 32225 nucleic acid or amino acid sequence; and comparing the 32225 sequence with one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 32225.

[2562] Detection of 32225 Variations or Mutations

[2563] The methods of the invention can also be used to detect genetic alterations in a 32225 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by misregulation in 32225 protein activity or nucleic acid expression, such as a neurological disorder, a liver disorder, a metabolic disorder, a cardiovascular disorder, or deregulated cell proliferation. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a 32225-protein, or the mis-expression of the 32225 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of 1) a deletion of one or more nucleotides from a 32225 gene; 2) an addition of one or more nucleotides to a 32225 gene; 3) a substitution of one or more nucleotides of a 32225 gene; 4) a chromosomal rearrangement of a 32225 gene; 5) an alteration in the level of a messenger RNA transcript of a 32225 gene; 6) aberrant modification of a 32225 gene, such as of the methylation pattern of the genomic DNA; 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of a 32225 gene; 8) a non-wild type level of a 32225-protein; 9) allelic loss of a 32225 gene; and 10) inappropriate post-translational modification of a 32225-protein.

[2564] An alteration can be detected with a probe/primer in a polymerase chain reaction, such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 32225-gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a 32225 gene under conditions such that hybridization and amplification of the 32225-gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[2565] In another embodiment, mutations in a 32225 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA is isolated, amplified (optionally), digested with one or more restriction endonucleases, and fragment length sizes are determined, e.g., by gel electrophoresis, and compared. Differences in fragment length sizes between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
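
By way of illustration only, the comparison of restriction fragment lengths between sample and control DNA could be simulated in silico as sketched below. The sequences are invented toy sequences, not 32225 sequences; the recognition site GAATTC, cut after the first base, corresponds to EcoRI; the function name and parameters are hypothetical; and only one strand of a linear molecule is modeled.

```python
def fragment_lengths(sequence, site, cut_offset):
    """Cut a linear sequence at every occurrence of a recognition site.

    cut_offset is the number of bases of the site that stay on the
    upstream fragment (1 for G^AATTC). Returns fragment lengths in order.
    """
    lengths = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset
        lengths.append(cut - start)
        start = cut
        position = sequence.find(site, position + 1)
    lengths.append(len(sequence) - start)
    return [length for length in lengths if length > 0]


# Toy sequences: the sample carries a single base change that destroys
# the second recognition site present in the control.
control = "AAA" + "GAATTC" + "TTTT" + "GAATTC" + "CC"
sample = "AAA" + "GAATTC" + "TTTT" + "GAGTTC" + "CC"

control_pattern = sorted(fragment_lengths(control, "GAATTC", 1))
sample_pattern = sorted(fragment_lengths(sample, "GAATTC", 1))
print(control_pattern, sample_pattern, control_pattern != sample_pattern)
```

A difference between the two sorted length lists plays the role of a differing band pattern on a gel.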

[2566] In other embodiments, genetic mutations in 32225 can be identified by hybridizing a sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the other. A different probe is located at each address of the plurality. A probe can be complementary to a region of a 32225 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of a 32225 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 32225 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
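
By way of illustration only, a linear series of sequential overlapping probes of the kind described above could be enumerated as sketched below. The function name, the probe length, the step size, and the toy sequence are hypothetical; in practice each probe would be the complement of the listed target window.

```python
def tile_windows(sequence, probe_length, step=1):
    """Return (start, window) pairs covering the sequence with overlapping windows."""
    if probe_length > len(sequence):
        return []
    return [
        (start, sequence[start:start + probe_length])
        for start in range(0, len(sequence) - probe_length + 1, step)
    ]


# Toy target sequence; every base is covered by at least one window.
for start, window in tile_windows("ACGTACGA", probe_length=5):
    print(start, window)
```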

[2567] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 32225 gene and detect mutations by comparing the sequence of the sample 32225 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.
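
By way of illustration only, the comparison of a sequenced sample with the corresponding wild-type (control) sequence could be sketched as follows. The sketch assumes the two sequences are already aligned and of equal length, so that it reports substitutions only; insertions and deletions would require an alignment step that is not shown. The function name and the toy sequences are hypothetical.

```python
def find_substitutions(wild_type, sample):
    """List (1-based position, wild-type base, sample base) for each mismatch."""
    if len(wild_type) != len(sample):
        raise ValueError("sequences must be pre-aligned and of equal length")
    return [
        (index + 1, wt_base, sample_base)
        for index, (wt_base, sample_base) in enumerate(zip(wild_type, sample))
        if wt_base != sample_base
    ]


# Toy sequences differing at a single position.
print(find_substitutions("ATGGCCTTA", "ATGACCTTA"))
```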

[2568] Other methods for detecting mutations in the 32225 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl Acad Sci USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295).

[2569] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 32225 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[2570] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 32225 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc Natl. Acad. Sci USA 86:2766; see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control 32225 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence; the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).

[2571] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[2572] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl Acad. Sci USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[2573] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.

[2574] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 50%, 60%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to a 32225 nucleic acid.
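
By way of illustration only, a percent complementarity of the kind recited above could be computed as sketched below. This disclosure does not prescribe a particular calculation; the sketch assumes an ungapped, antiparallel comparison of an oligonucleotide with a target region of the same length, both written 5′ to 3′, and the function name and toy sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementary(oligo, target_region):
    """Percentage of oligo bases that pair with the antiparallel target region."""
    if len(oligo) != len(target_region) or not oligo:
        raise ValueError("sequences must be non-empty and of equal length")
    # The oligo read 5'->3' pairs with the target read 3'->5'.
    paired = sum(
        COMPLEMENT.get(oligo_base) == target_base
        for oligo_base, target_base in zip(oligo, reversed(target_region))
    )
    return 100.0 * paired / len(oligo)


print(percent_complementary("AAAA", "TTTA"))
```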

[2575] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO:33 or the complement of SEQ ID NO:33. Different locations can be different but overlapping, or non-overlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[2576] The set can be useful, e.g., for identifying SNPs, or identifying specific alleles of 32225. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a biallelic or polymorphic locus.

[2577] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the T_m of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
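
By way of illustration only, a set of four oligonucleotides differing solely at an interrogation position could be generated as sketched below. The function name, the template sequence, and the chosen position are hypothetical and are not derived from SEQ ID NO:33.

```python
def interrogation_set(template, position):
    """Return {base: oligo} with each of A, C, G, T placed at the 0-based position."""
    if not 0 <= position < len(template):
        raise ValueError("position must lie within the template")
    return {
        base: template[:position] + base + template[position + 1:]
        for base in "ACGT"
    }


# Toy template; position 3 is treated as the interrogation position.
for base, oligo in interrogation_set("ACGTACG", 3).items():
    print(base, oligo)
```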

[2578] In a preferred embodiment, the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, a 32225 nucleic acid.

[2579] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a 32225 gene.

[2580] Use of 32225 Molecules as Surrogate Markers

[2581] The 32225 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 32225 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 32225 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker which correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[2582] The 32225 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker which correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., a 32225 marker) transcription or expression, the amplified marker may be in a quantity which is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-32225 antibodies may be employed in an immune-based detection system for a 32225 protein marker, or 32225-specific radiolabeled probes may be used to detect a 32225 mRNA marker. Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. 
Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[2583] The 32225 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker which correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35:1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA or protein (e.g., 32225 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 32225 DNA may correlate with 32225 drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[2584] Pharmaceutical Compositions of 32225

[2585] The nucleic acid and polypeptides, fragments thereof, as well as anti-32225 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[2586] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[2587] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol or sorbitol, or sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

[2588] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying which yields a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[2589] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[2590] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[2591] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[2592] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[2593] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[2594] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[2595] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds which exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
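The therapeutic index defined above is a simple ratio, and a minimal sketch of the calculation follows. The function name and the numeric values are hypothetical and chosen only for illustration; they are not data from this specification.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the ratio LD50/ED50.

    Both doses must be expressed in the same units (e.g., mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg, for illustration only.
example_ti = therapeutic_index(ld50=200.0, ed50=10.0)
print(example_ti)
```

A larger ratio corresponds to a wider margin between the therapeutically effective dose and the lethal dose, which is why compounds with high therapeutic indices are preferred.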

[2596] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.
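As an illustration of how a half-maximal concentration might be read off cell culture data, the sketch below interpolates linearly on a log-concentration scale between the two measurements that bracket 50% inhibition. This is one simple estimation approach assumed here for illustration, not a method stated in this specification; the function name and the sample data are hypothetical, and the sketch assumes positive concentrations and inhibition that increases with concentration.

```python
import math


def estimate_ic50(concentrations, percent_inhibition):
    """Estimate the concentration giving 50% inhibition.

    Uses linear interpolation on log10(concentration) between the two
    adjacent data points that bracket 50% inhibition.
    """
    points = sorted(zip(concentrations, percent_inhibition))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if y_lo <= 50.0 <= y_hi and y_hi != y_lo:
            frac = (50.0 - y_lo) / (y_hi - y_lo)
            log_lo, log_hi = math.log10(c_lo), math.log10(c_hi)
            return 10 ** (log_lo + frac * (log_hi - log_lo))
    raise ValueError("50% inhibition is not bracketed by the data")


# Hypothetical dose-response data (arbitrary concentration units).
ic50 = estimate_ic50([1.0, 100.0], [0.0, 100.0])
print(ic50)
```

In practice a fitted sigmoidal dose-response model is often used instead of two-point interpolation; the sketch only shows the meaning of the half-maximal point.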

[2597] As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
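The weight-based ranges above translate into absolute amounts by multiplying by body weight. The following sketch shows that arithmetic for the first four ranges recited in this paragraph; the tier labels, the function name, and the 70 kg example weight are assumptions introduced for illustration.

```python
# Protein or polypeptide dosage ranges from the paragraph above,
# in mg per kg of body weight. The tier labels are illustrative.
PROTEIN_DOSE_RANGES_MG_PER_KG = {
    "broad": (0.001, 30.0),
    "preferred": (0.01, 25.0),
    "more preferred": (0.1, 20.0),
    "even more preferred": (1.0, 10.0),
}


def dose_range_mg(body_weight_kg: float, tier: str = "broad"):
    """Return the (low, high) absolute dose in mg for a given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = PROTEIN_DOSE_RANGES_MG_PER_KG[tier]
    return (low * body_weight_kg, high * body_weight_kg)


# Hypothetical 70 kg subject.
print(dose_range_mg(70.0, "even more preferred"))
```

The sketch performs only the unit arithmetic; as the paragraph notes, the dosage and timing actually required depend on factors such as disease severity, previous treatments, and the general health and age of the subject.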

[2598] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J. Acquired Immune Deficiency Syndromes and Human Retrovirology 14:193).

[2599] The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[2600] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.
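Because the exemplary small molecule ranges mix microgram and milligram units, comparing a candidate dose against them requires a common unit. The sketch below converts the three ranges recited above to micrograms per kilogram and reports which of them contain a given dose; the constant and function names and the sample doses are hypothetical.

```python
UG_PER_MG = 1000  # micrograms per milligram

# Exemplary small molecule ranges from the paragraph above,
# expressed as (low, high) in micrograms per kilogram.
SMALL_MOLECULE_RANGES_UG_PER_KG = [
    (1, 500 * UG_PER_MG),  # about 1 ug/kg to about 500 mg/kg
    (100, 5 * UG_PER_MG),  # about 100 ug/kg to about 5 mg/kg
    (1, 50),               # about 1 ug/kg to about 50 ug/kg
]


def ranges_containing(dose_ug_per_kg):
    """Return the exemplary ranges that include the given dose."""
    return [
        (low, high)
        for (low, high) in SMALL_MOLECULE_RANGES_UG_PER_KG
        if low <= dose_ug_per_kg <= high
    ]


# Hypothetical candidate doses in ug/kg.
print(ranges_containing(25))
print(ranges_containing(2 * UG_PER_MG))
```

The check is purely arithmetic and does not select a dose; the appropriate dose still depends on the potency of the small molecule and the subject-specific factors listed above.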

[2601] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, puromycin, maytansinoids, e.g., maytansinol (see U.S. Pat. No. 5,208,020), CC-1065 (see U.S. Pat. Nos. 5,475,092, 5,585,499, 5,846,545) and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, CC-1065, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine, vinblastine, taxol and maytansinoids). Radioactive ions include, but are not limited to iodine, yttrium and praseodymium.

[2602] The conjugates of the invention can be used for modifying a given biological response; the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[2603] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[2604] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[2605] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[2606] Methods of Treatment for 32225

[2607] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 32225 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[2608] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. “Pharmacogenomics”, as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype”, or “drug response genotype”). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 32225 molecules of the present invention or 32225 modulators according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

[2609] In one aspect, the invention provides a method for preventing in a subject, a disease or condition associated with an aberrant or unwanted 32225 expression or activity, by administering to the subject a 32225 or an agent which modulates 32225 expression or at least one 32225 activity. Subjects at risk for a disease which is caused or contributed to by aberrant or unwanted 32225 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 32225 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 32225 aberrance, for example, a 32225, 32225 agonist or 32225 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[2610] It is possible that some 32225 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[2611] The 32225 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of cellular proliferative and/or differentiative disorders, neural disorders, liver disorders, metabolic disorders, or cardiovascular disorders, as discussed above. In addition, the 32225 molecules of the invention may act as novel diagnostic targets and therapeutic agents for controlling disorders associated with bone metabolism, immune disorders, viral diseases, or pain disorders.

[2612] Aberrant expression and/or activity of 32225 molecules may mediate disorders associated with bone metabolism. “Bone metabolism” refers to direct or indirect effects in the formation or degeneration of bone structures, e.g., bone formation, bone resorption, etc., which may ultimately affect the concentrations in serum of calcium and phosphate. This term also includes activities mediated by the effects of 32225 molecules in bone cells, e.g., osteoclasts and osteoblasts, that may in turn result in bone formation and degeneration. For example, 32225 molecules may support different activities of bone resorbing osteoclasts such as the stimulation of differentiation of monocytes and mononuclear phagocytes into osteoclasts. Accordingly, 32225 molecules that modulate the production of bone cells can influence bone formation and degeneration, and thus may be used to treat bone disorders. Examples of such disorders include, but are not limited to, osteoporosis, osteodystrophy, osteomalacia, rickets, osteitis fibrosa cystica, renal osteodystrophy, osteosclerosis, anti-convulsant treatment, osteopenia, fibrogenesis-imperfecta ossium, secondary hyperparathyroidism, hypoparathyroidism, hyperparathyroidism, cirrhosis, obstructive jaundice, drug induced metabolism, medullary carcinoma, chronic renal disease, rickets, sarcoidosis, glucocorticoid antagonism, malabsorption syndrome, steatorrhea, tropical sprue, idiopathic hypercalcemia and milk fever.

[2613] The 32225 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of immune disorders. Examples of immune disorders or diseases include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy such as, atopic allergy.

[2614] 32225 molecules may play an important role in the etiology of certain viral diseases, including but not limited to Hepatitis B, Hepatitis C and Herpes Simplex Virus (HSV). Modulators of 32225 activity could be used to control viral diseases. The modulators can be used in the treatment and/or diagnosis of viral infected tissue or virus-associated tissue fibrosis, especially liver and liver fibrosis. Also, 32225 modulators can be used in the treatment and/or diagnosis of virus-associated carcinoma, especially hepatocellular cancer.

[2615] Additionally, 32225 may play an important role in the regulation of pain disorders. Examples of pain disorders include, but are not limited to, pain response elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia, usually referred to as hyperalgesia (described in, for example, Fields, H. L. (1987) Pain, New York: McGraw-Hill); pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches; pain associated with surgery; pain related to irritable bowel syndrome; or chest pain.

[2616] As discussed, successful treatment of 32225 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using an assay described above, that prove to exhibit negative modulatory activity, can be used in accordance with the invention to prevent and/or ameliorate symptoms of 32225 disorders. Such molecules can include, but are not limited to peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)₂ and Fab expression library fragments, scFV molecules, and epitope-binding fragments thereof).

[2617] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[2618] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[2619] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 32225 expression is through the use of aptamer molecules specific for 32225 protein. Aptamers are nucleic acid molecules having a tertiary structure which permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. (1997) Curr. Opin. Chem Biol. 1: 5-9; and Patel, D. J. (1997) Curr Opin Chem Biol 1:32-46). Since nucleic acid molecules may in many cases be more conveniently introduced into target cells than therapeutic protein molecules may be, aptamers offer a method by which 32225 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[2620] Antibodies can be generated that are both specific for target gene product and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 32225 disorders. For a description of antibodies, see the Antibody section above.

[2621] In circumstances wherein injection of an animal or a human subject with a 32225 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 32225 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. (1999) Ann Med 31:66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. (1998) Cancer Treat Res. 94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 32225 protein. Vaccines directed to a disease characterized by 32225 expression may also be generated in this fashion.

[2622] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see e.g., Marasco et al. (1993) Proc. Natl. Acad. Sci. USA 90:7889-7893).

[2623] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 32225 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures as described above.

[2624] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED₅₀ with little or no toxicity. The dosage can vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose can be formulated in animal models to achieve a circulating plasma concentration range that includes the IC₅₀ (i.e., the concentration of the test compound that achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma can be measured, for example, by high performance liquid chromatography.

[2625] Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. The compound which is able to modulate 32225 activity is used as a template, or “imprinting molecule”, to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix which contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al. (1996) Current Opinion in Biotechnology 7:89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2:166-173. Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al. (1993) Nature 361:645-647. Through the use of isotope-labeling, the “free” concentration of compound which modulates the expression or activity of 32225 can be readily monitored and used in calculations of IC₅₀.

[2626] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC₅₀. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al. (1995) Analytical Chemistry 67:2142-2144.

[2627] Another aspect of the invention pertains to methods of modulating 32225 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with a 32225 molecule or an agent that modulates one or more of the activities of 32225 protein associated with the cell. An agent that modulates 32225 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of a 32225 protein (e.g., a 32225 substrate or receptor), a 32225 antibody, a 32225 agonist or antagonist, a peptidomimetic of a 32225 agonist or antagonist, or other small molecule.

[2628] In one embodiment, the agent stimulates one or more 32225 activities. Examples of such stimulatory agents include active 32225 protein and a nucleic acid molecule encoding 32225. In another embodiment, the agent inhibits one or more 32225 activities. Examples of such inhibitory agents include antisense 32225 nucleic acid molecules, anti-32225 antibodies, and 32225 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of a 32225 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., up regulates or down regulates) 32225 expression or activity. In another embodiment, the method involves administering a 32225 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 32225 expression or activity.

[2629] Stimulation of 32225 activity is desirable in situations in which 32225 is abnormally downregulated and/or in which increased 32225 activity is likely to have a beneficial effect. Likewise, inhibition of 32225 activity is desirable in situations in which 32225 is abnormally upregulated and/or in which decreased 32225 activity is likely to have a beneficial effect.

[2630] 32225 Pharmacogenomics

[2631] The 32225 molecules of the present invention, as well as agents or modulators that have a stimulatory or inhibitory effect on 32225 activity (e.g., 32225 gene expression) as identified by a screening assay described herein, can be administered to individuals to treat (prophylactically or therapeutically) 32225-associated disorders, e.g., a neurological disorder, a liver disorder, a metabolic disorder, a cardiovascular disorder, or a cellular proliferative disorder, associated with aberrant or unwanted 32225 activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a 32225 molecule or 32225 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a 32225 molecule or 32225 modulator.

[2632] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23:983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43:254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as a single factor altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase (G6PD) deficiency is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[2633] One pharmacogenomics approach to identifying genes that predict drug response, known as a “genome-wide association”, relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high-resolution map can be generated from a combination of some ten million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
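The grouping of individuals into genetic categories by SNP pattern can be sketched as follows. This is a simplified illustration only: the function name `group_by_snp_pattern`, the marker names, and the subject identifiers are hypothetical, and each individual is assumed to carry a single recorded allele per marker, which ignores diploid genotypes.

```python
from collections import defaultdict


def group_by_snp_pattern(genotypes, markers):
    """Group individuals whose alleles agree at every listed marker.

    genotypes: dict mapping an individual ID to a dict of marker -> allele.
    markers: ordered list of marker names that define the pattern.
    """
    groups = defaultdict(list)
    for individual, alleles in genotypes.items():
        # The tuple of alleles across the chosen markers is the category key.
        pattern = tuple(alleles[m] for m in markers)
        groups[pattern].append(individual)
    return dict(groups)


# Hypothetical genotypes at two bi-allelic markers.
genotypes = {
    "subject_1": {"snp_a": "A", "snp_b": "C"},
    "subject_2": {"snp_a": "G", "snp_b": "C"},
    "subject_3": {"snp_a": "A", "snp_b": "C"},
}
groups = group_by_snp_pattern(genotypes, ["snp_a", "snp_b"])
```

Each resulting group could then be examined for a shared drug response, which is the comparison the genome-wide approach relies on.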

[2634] Alternatively, a method termed the “candidate gene approach,” can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., a 32225 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.

[2635] Alternatively, a method termed “gene expression profiling” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., a 32225 molecule or 32225 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[2636] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 32225 molecule or 32225 modulator, such as a modulator identified by one of the exemplary screening assays described herein.

[2637] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 32225 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 32225 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells, e.g., human cells, will become sensitive to treatment with an agent that the unmodified target cells were resistant to.

[2638] Monitoring the influence of agents (e.g., drugs) on the expression or activity of a 32225 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 32225 gene expression, protein levels, or upregulate 32225 activity, can be monitored in clinical trials of subjects exhibiting decreased 32225 gene expression, protein levels, or downregulated 32225 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 32225 gene expression, protein levels, or downregulate 32225 activity, can be monitored in clinical trials of subjects exhibiting increased 32225 gene expression, protein levels, or upregulated 32225 activity. In such clinical trials, the expression or activity of a 32225 gene, and preferably, other genes that have been implicated in, for example, a 32225-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[2639] 32225 Informatics

[2640] The sequence of a 32225 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a 32225 sequence. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 32225 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequences, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[2641] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[2642] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[2643] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples of annotations to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites, and splice junctions. Non-limiting examples of annotations to amino acid sequences include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
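The two-table arrangement described above can be sketched as follows. This is a minimal illustration, assuming SQLite (through Python's standard `sqlite3` module) as a stand-in for a relational database such as Sybase or Oracle; the table names, column names, the helper `annotated_regions`, and the placeholder sequence are hypothetical and are not a sequence of the invention.

```python
import sqlite3

# In-memory database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sequences (
    sequence_id TEXT PRIMARY KEY,   -- identifier for the sequence
    sequence    TEXT NOT NULL       -- nucleic acid or amino acid sequence
);
CREATE TABLE annotations (
    sequence_id TEXT NOT NULL REFERENCES sequences(sequence_id),
    descriptor  TEXT NOT NULL,      -- annotation text
    start_pos   INTEGER NOT NULL,   -- initial position (1-based)
    end_pos     INTEGER NOT NULL    -- ultimate position (1-based, inclusive)
);
""")

# Placeholder data, not an actual sequence of the invention.
conn.execute("INSERT INTO sequences VALUES (?, ?)",
             ("example_seq", "ATGGCCAAGTAA"))
conn.executemany(
    "INSERT INTO annotations VALUES (?, ?, ?, ?)",
    [("example_seq", "open reading frame", 1, 12),
     ("example_seq", "stop codon", 10, 12)],
)


def annotated_regions(connection, sequence_id):
    """Return (descriptor, subsequence) pairs for one stored sequence."""
    rows = connection.execute(
        "SELECT a.descriptor, "
        "       substr(s.sequence, a.start_pos, a.end_pos - a.start_pos + 1) "
        "FROM annotations a JOIN sequences s ON s.sequence_id = a.sequence_id "
        "WHERE a.sequence_id = ? ORDER BY a.start_pos",
        (sequence_id,),
    )
    return rows.fetchall()


regions = annotated_regions(conn, "example_seq")
```

Keeping annotations in a separate table keyed by the sequence identifier allows any number of annotations per sequence without changing the sequence table.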

[2644] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.

[2645] Thus, in one aspect, the invention features a method of analyzing 32225, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing a 32225 nucleic acid or amino acid sequence; and comparing the 32225 sequence with a second sequence, e.g., one or, more preferably, a plurality of sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 32225. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[2646] The method can include evaluating the sequence identity between a 32225 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.

[2647] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
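The relationship between target length and chance occurrence can be illustrated with a rough calculation. The sketch below assumes that residues are independent and equally likely, which real sequences are not, so the figures are only indicative; the function name `expected_random_matches` and the database size of one million residues are hypothetical.

```python
def expected_random_matches(target_length, database_length, alphabet_size):
    """Expected number of chance occurrences of one specific target sequence.

    Assumes independent, equally likely residues; the result is a rough
    guide rather than a statistical significance measure.
    """
    # Number of start positions at which the target could occur.
    positions = max(database_length - target_length + 1, 0)
    # Probability that a given position matches the target exactly.
    return positions * (1.0 / alphabet_size) ** target_length


# Hypothetical database of one million nucleotides (alphabet size 4).
six_mer = expected_random_matches(6, 1_000_000, 4)
thirty_mer = expected_random_matches(30, 1_000_000, 4)
```

Under these assumptions a six-nucleotide target is expected to occur by chance roughly 244 times in such a database, whereas a 30-nucleotide target is expected far less than once.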

[2648] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting searches is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[2649] Thus, the invention features a method of making a computer readable record of a 32225 sequence which includes recording the sequence on a computer readable matrix. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[2650] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing a 32225 sequence, or record, in machine-readable form; comparing a second sequence to the 32225 sequence; thereby analyzing a sequence. Comparison can include comparing to sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 32225 sequence includes a sequence being compared. In a preferred embodiment the 32225 or second sequence is stored on a first computer, e.g., at a first site and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 32225 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.
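The two kinds of comparison mentioned above, sequence identity and inclusion of one sequence within another, can be sketched as follows. This is a simplified illustration: the function names and example sequences are hypothetical, `percent_identity` assumes the two sequences have already been aligned to equal length, and a routine tool such as BLAST would be used for real searches.

```python
def percent_identity(seq_a, seq_b):
    """Percent of identical positions between two equal-length, pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        return 0.0
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


def find_occurrences(stored_sequence, query):
    """Return 0-based start positions where query occurs within stored_sequence."""
    positions = []
    start = stored_sequence.find(query)
    while start != -1:
        positions.append(start)
        # Resume one position later so overlapping occurrences are found.
        start = stored_sequence.find(query, start + 1)
    return positions


# Placeholder sequences, not sequences of the invention.
stored = "ATGGCCATGGTA"
identity = percent_identity("ATGGCC", "ATGACC")
hits = find_occurrences(stored, "ATGG")
```

An empty list from `find_occurrences` indicates that the stored sequence does not include the sequence being compared.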

[2651] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has a 32225-associated disease or disorder or a pre-disposition to a 32225-associated disease or disorder, wherein the method comprises the steps of determining 32225 sequence information associated with the subject and based on the 32225 sequence information, determining whether the subject has a 32225-associated disease or disorder or a pre-disposition to a 32225-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[2652] The invention further provides in an electronic system and/or in a network, a method for determining whether a subject has a 32225-associated disease or disorder or a pre-disposition to a disease associated with a 32225 wherein the method comprises the steps of determining 32225 sequence information associated with the subject, and based on the 32225 sequence information, determining whether the subject has a 32225-associated disease or disorder or a pre-disposition to a 32225-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, comparing the 32225 sequence of the subject to the 32225 sequences in the database to thereby determine whether the subject has a 32225-associated disease or disorder, or a pre-disposition for such.

[2653] The present invention also provides in a network, a method for determining whether a subject has a 32225-associated disease or disorder or a pre-disposition to a 32225-associated disease or disorder, said method comprising the steps of receiving 32225 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 32225 and/or corresponding to a 32225-associated disease or disorder (e.g., a neurological disorder, a liver disorder, a metabolic disorder, a cardiovascular disorder, or a cellular proliferative and/or differentiative disorder), and based on one or more of the phenotypic information, the 32225 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has a 32225-associated disease or disorder or a pre-disposition to a 32225-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[2654] The present invention also provides a method for determining whether a subject has a 32225-associated disease or disorder or a pre-disposition to a 32225-associated disease or disorder, said method comprising the steps of receiving information related to 32225 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 32225 and/or related to a 32225-associated disease or disorder, and based on one or more of the phenotypic information, the 32225 information, and the acquired information, determining whether the subject has a 32225-associated disease or disorder or a pre-disposition to a 32225-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[2655] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

Background of the 47508 Invention

[2656] Chromatin is a complex of DNA and proteins. The protein component of chromatin includes five different proteins, known collectively as histones. The histones are designated H1, H2A, H2B, H3 and H4 and are highly conserved across genera. Two of each of the histones H2A, H2B, H3, and H4 form a core octamer complex around which DNA is wrapped, or spooled, leading to its condensation.

[2657] Histones undergo numerous alterations during the cell cycle (reviewed in Mahlknecht and Hoelzer (2000), Molecular Medicine 6(8):623-44). For example, histone H4 is acted upon by a battery of enzymes that carry out various covalent modifications. In particular, four lysine residues in the N-terminal region of histone H4 undergo reversible acetylation. This enzymatic reaction is catalyzed by a histone acetylase, which adds acetyl groups donated by acetyl CoA. The acetyl groups are removed by a hydrolysis reaction catalyzed by a histone deacetylase. Histones H2A, H2B, and H3 are likewise acetylated and deacetylated.

[2658] Acetylation and other covalent modifications alter the net charge of histones. For example, the unmodified histone H4 has a net charge of +5, while the fully modified protein has a net charge of −2. This change in net charge alters the affinity of histones or particular histone domains for DNA and for other proteins. Histone acetylation, which shifts the net charge of histones in the negative direction, leads to the unraveling of chromatin, which is often accompanied by an increase in gene transcription in the unraveled region. Conversely, histone deacetylation shifts the net charge of histones in the positive direction, strengthening the interaction between histones and DNA, and leading to chromosomal condensation. Chromosomal condensation is associated with the silencing of DNA transcription. Thus, the enzymes involved in the covalent modification of histones, histone acetylases and histone deacetylases, play a regulatory role in DNA replication and transcription. Through these activities, histone acetylases and deacetylases are believed to influence major cellular processes, such as differentiation and proliferation, as well as abnormal processes like tumor growth (Mahlknecht and Hoelzer (2000), supra).

Summary of the 47508 Invention

[2659] The present invention is based, in part, on the discovery of a novel histone deacetylase family member, referred to herein as “47508”. The nucleotide sequence of a cDNA encoding 47508 is recited in SEQ ID NO:41, and the amino acid sequence of a 47508 polypeptide is recited in SEQ ID NO:42 (see also Example 29, below). In addition, the nucleotide sequence of the coding region is depicted in SEQ ID NO:43.

[2660] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes a 47508 protein or polypeptide, e.g., a biologically active portion of the 47508 protein. In a preferred embodiment the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO:42. In other embodiments, the invention provides isolated 47508 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO:41, SEQ ID NO:43, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO:41, SEQ ID NO:43, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:41, SEQ ID NO:43, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 47508 protein or an active fragment thereof.

[2661] In a related aspect, the invention further provides nucleic acid constructs that include a 47508 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included are vectors and host cells containing the 47508 nucleic acid molecules of the invention, e.g., vectors and host cells suitable for producing 47508 nucleic acid molecules and polypeptides.

[2662] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 47508-encoding nucleic acids.

[2663] In still another related aspect, isolated nucleic acid molecules that are antisense to a 47508 encoding nucleic acid molecule are provided.

[2664] In another aspect, the invention features 47508 polypeptides, and biologically active or antigenic fragments thereof, that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 47508-mediated or -related disorders. In another embodiment, the invention provides 47508 polypeptides having a 47508 activity. Preferred polypeptides are 47508 proteins including at least one histone deacetylase domain, and, preferably, having a 47508 activity, e.g., a 47508 activity as described herein.

[2665] In other embodiments, the invention provides 47508 polypeptides, e.g., a 47508 polypeptide having the amino acid sequence shown in SEQ ID NO:42 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO:42 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:41, SEQ ID NO:43, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 47508 protein or an active fragment thereof.

[2666] In a related aspect, the invention provides 47508 polypeptides or fragments operatively linked to non-47508 polypeptides to form fusion proteins.

[2667] In another aspect, the invention features antibodies and antigen-binding fragments thereof, that react with, or more preferably specifically bind 47508 polypeptides or fragments thereof, e.g., a histone deacetylase domain of 47508.

[2668] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 47508 polypeptides or nucleic acids.

[2669] In still another aspect, the invention provides a process for modulating 47508 polypeptide or nucleic acid expression or activity, e.g., using the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 47508 polypeptides or nucleic acids, such as conditions involving aberrant or deficient cellular proliferation or differentiation.

[2670] The invention also provides assays for determining the activity of or the presence or absence of 47508 polypeptides or nucleic acid molecules in a biological sample, including for disease diagnosis.

[2671] In yet another aspect, the invention provides methods for inhibiting the proliferation, or inducing the killing, of a 47508-expressing cell, e.g., a hyper-proliferative 47508-expressing cell. The method includes contacting the cell with an agent, e.g., a compound (e.g., a compound identified using the methods described herein), that modulates the activity, or expression, of the 47508 polypeptide or nucleic acid. In a preferred embodiment, the contacting step is effected in vitro or ex vivo. In other embodiments, the contacting step is effected in vivo, e.g., in a subject (e.g., a mammal, e.g., a human), as part of a therapeutic or prophylactic protocol. Preferably, the cell is a hyperproliferative cell, e.g., a cell found in a solid tumor, a soft tissue tumor, or a metastatic lesion. In another preferred embodiment, the cell or tumor is found in, e.g., a breast, ovary, lung, colon, or liver tissue.

[2672] In a preferred embodiment, the compound is an inhibitor of a 47508 polypeptide. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion). In another preferred embodiment, the compound is an inhibitor of a 47508 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[2673] In a preferred embodiment, the compound is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[2674] In another embodiment, the compound is an activator of a 47508 polypeptide. Preferably, the activator is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule, and an antibody. In another embodiment, the compound is an activator of a 47508 nucleic acid.

[2675] In another aspect, the invention features methods for treating or preventing a disorder characterized by aberrant cellular proliferation or differentiation of a 47508-expressing cell, in a subject. Preferably, the method includes administering to the subject (e.g., a mammal, e.g., a human) an effective amount of a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 47508 polypeptide or nucleic acid. In a preferred embodiment, the disorder is a cancerous or pre-cancerous condition.

[2676] In a further aspect, the invention provides methods for evaluating the efficacy of a treatment of a disorder, e.g., a proliferative disorder. The method includes: treating a subject, e.g., a patient or an animal, with a protocol under evaluation (e.g., treating a subject with one or more of: chemotherapy, radiation, and/or a compound identified using the methods described herein); and evaluating the expression of a 47508 nucleic acid or polypeptide before and after treatment. A change, e.g., a decrease or increase, in the level of a 47508 nucleic acid (e.g., mRNA) or polypeptide after treatment, relative to the level of expression before treatment, is indicative of the efficacy of the treatment of the disorder. The level of 47508 nucleic acid or polypeptide expression can be detected by any method described herein.

[2677] In a preferred embodiment, the evaluating step includes obtaining a sample (e.g., a tissue sample, e.g., a biopsy, or a fluid sample) from the subject before and after treatment, and comparing the level of expression of a 47508 nucleic acid (e.g., mRNA) or polypeptide before and after treatment.

[2678] In another aspect, the invention provides methods for evaluating the efficacy of a therapeutic or prophylactic agent (e.g., an anti-neoplastic agent). The method includes: contacting a sample with an agent (e.g., a compound identified using the methods described herein, or a cytotoxic agent); and evaluating the expression of 47508 nucleic acid or polypeptide in the sample before and after the contacting step. A change, e.g., a decrease or increase, in the level of 47508 nucleic acid (e.g., mRNA) or polypeptide in the sample obtained after the contacting step, relative to the level of expression in the sample before the contacting step, is indicative of the efficacy of the agent. The level of 47508 nucleic acid or polypeptide expression can be detected by any method described herein. In a preferred embodiment, the sample includes cells obtained from a cancerous tissue, e.g., a cancerous breast, ovary, lung, colon, or liver tissue.

[2679] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 47508 polypeptide or nucleic acid molecule, including for disease diagnosis.

[2680] In another aspect, the invention features a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes a 47508 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to a 47508 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 47508 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.

[2681] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

Detailed Description of 47508

[2682] The human 47508 sequence (see SEQ ID NO:41, as recited in Example 29), which is approximately 1579 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 1242 nucleotides, including the termination codon. The coding sequence encodes a 413 amino acid protein (see SEQ ID NO:42, as recited in Example 29).
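By way of illustration only, the relationship between the stated protein length and the stated coding sequence length can be checked with a short calculation. The function name and constants below are hypothetical and are not part of the specification; the sketch assumes a standard three-nucleotide codon and a single termination codon. Because the total length of 1579 nucleotides is stated as approximate, the untranslated remainder is likewise approximate.

```python
# Minimal sketch (hypothetical helper names): check that a 413-residue protein
# corresponds to a coding sequence of about 1242 nucleotides including the stop codon.

NT_PER_CODON = 3      # standard genetic code: three nucleotides per codon
STOP_CODON_NT = 3     # a single termination codon


def cds_length_nt(n_residues: int, include_stop: bool = True) -> int:
    """Return the coding sequence length, in nucleotides, for n_residues amino acids."""
    length = n_residues * NT_PER_CODON
    if include_stop:
        length += STOP_CODON_NT
    return length


cds = cds_length_nt(413)        # 413 * 3 + 3
utr_total = 1579 - cds          # approximate combined 5' and 3' untranslated length
print(cds, utr_total)           # prints: 1242 337
```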

[2683] Human 47508 contains the following regions or other structural features:

[2684] a histone deacetylase domain (PFAM accession number PF00850) located at about amino acid residues 83 to 392 of SEQ ID NO:42;

[2685] a histone deacetylase zinc-binding triad having two conserved aspartic acid residues, located at about amino acid residues 247 and 327 of SEQ ID NO:42, and one conserved histidine residue, located at about amino acid residue 249 of SEQ ID NO:42;

[2686] a first charge-relay system formed by a conserved histidine residue, located at about amino acid residue 208 of SEQ ID NO:42, and a conserved aspartic acid residue, located at about amino acid residue 245 of SEQ ID NO:42;

[2687] a second charge-relay system formed by a conserved histidine residue, located at about amino acid residue 209 of SEQ ID NO:42, and a conservatively substituted asparagine residue, located at about amino acid residue 251 of SEQ ID NO:42;

[2688] four Protein Kinase C phosphorylation sites (PS00005) located at about amino acid residues 87 to 89, 142 to 144, 212 to 214, and 374 to 376 of SEQ ID NO:42;

[2689] five Casein Kinase II phosphorylation sites (PS00006) located at about amino acid residues 133 to 136, 157 to 160, 242 to 245, 295 to 298, and 311 to 314 of SEQ ID NO:42;

[2690] seven N-myristylation sites (PS00008) located at about amino acid residues 2 to 7, 37 to 42, 55 to 60, 66 to 71, 186 to 191, 215 to 220, and 237 to 242 of SEQ ID NO:42; and

[2691] one amidation site (PS00009) located at about amino acid residues 356 to 359.

[2692] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.
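By way of illustration only, predicted sites of the kinds listed above can be located by scanning an amino acid sequence with regular expressions. The sketch below is not the method used to generate the listed positions. The four patterns are the commonly cited PROSITE consensus patterns for these accession numbers, stated here as an assumption because the specification does not recite them, and the toy sequence is hypothetical and is not SEQ ID NO:42.

```python
import re

# Assumed PROSITE consensus patterns, translated to regular expressions:
#   PS00005  [ST]-x-[RK]                        protein kinase C phosphorylation site
#   PS00006  [ST]-x(2)-[DE]                     casein kinase II phosphorylation site
#   PS00008  G-{EDRKHPFYW}-x(2)-[STAGCN]-{P}    N-myristylation site
#   PS00009  x-G-[RK]-[RK]                      amidation site
PROSITE_PATTERNS = {
    "PS00005": r"[ST].[RK]",
    "PS00006": r"[ST].{2}[DE]",
    "PS00008": r"G[^EDRKHPFYW].{2}[STAGCN][^P]",
    "PS00009": r".G[RK][RK]",
}


def scan_sites(seq: str) -> dict:
    """Return {accession: [(start, end), ...]} using 1-based, inclusive residue numbers.

    A lookahead is used so that overlapping matches are all reported.
    """
    hits = {}
    for accession, pattern in PROSITE_PATTERNS.items():
        regex = re.compile(f"(?=({pattern}))")
        hits[accession] = [
            (m.start() + 1, m.start() + len(m.group(1)))
            for m in regex.finditer(seq)
        ]
    return hits


# Hypothetical toy sequence for demonstration only.
toy_sequence = "MGAASGKTLRSAAEAGRKP"
for accession, sites in scan_sites(toy_sequence).items():
    print(accession, sites)
```

Matches of such short patterns are predictions only; many occur by chance and are not necessarily modified in vivo.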

[2693] A plasmid containing the nucleotide sequence encoding human 47508 (clone “Fbh47508FL”) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

[2694] The 47508 protein contains a significant number of structural characteristics in common with members of the histone deacetylase family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics.

[2695] A histone deacetylase family of proteins is characterized by a common fold, which is a single domain open α/β fold. The fold consists of a central eight-stranded parallel β-sheet with four α-helices packed against each face of the β-sheet. There are an additional eight α-helices, most of which are clustered near one edge of the β-sheet. Together with loops that arise from the carboxy-terminal ends of the β-strands of the β-sheet, the additional helices help form a deep, narrow pocket and an internal cavity adjacent to the pocket. At the bottom of the pocket, two aspartic acid residues and a histidine residue are involved in coordinating a zinc ion. The bottom of the pocket contains an additional five highly conserved residues, including two histidine residues, two aspartic acid residues, and a tyrosine residue. Each of the histidine residues interacts with an aspartic acid residue, creating a charge-relay pair and thereby increasing the basicity of the imidazole Nε atom. It is believed that the substrate lysine residue, which is to be deacetylated, inserts into the pocket such that the acetylated end of the lysine residue coordinates with the zinc ion, and the aliphatic carbon atoms of the lysine substrate form van der Waals contacts with conserved hydrophobic residues that line the wall of the pocket. Once the acetylated lysine residue is positioned correctly, the two histidine/aspartic acid charge-relay pairs help catalyze the removal of the acetyl group, giving rise to a deacetylated lysine residue and a free acetate molecule. The structure of a histone deacetylase domain has been described, e.g., in Finnin et al. (1999), Nature 401(6749):188-93, the contents of which are incorporated herein by reference.

[2696] A 47508 polypeptide can include a “histone deacetylase domain” or regions homologous with a “histone deacetylase domain”.

[2697] As used herein, the term “histone deacetylase domain” includes an amino acid sequence of about 250 to 500 amino acid residues in length and having a bit score for the alignment of the sequence to the histone deacetylase domain profile (Pfam HMM) of at least 50. Preferably, a histone deacetylase domain includes at least about 275 to 450 amino acids, more preferably about 300 to 425 amino acid residues, or about 305 to 400 amino acids and has a bit score for the alignment of the sequence to the histone deacetylase domain (HMM) of at least 75, 80, 82, 83, 84, or greater. The histone deacetylase domain (HMM) has been assigned the PFAM Accession Number PF00850 (http://genome.wustl.edu/Pfam/.html). An alignment of the histone deacetylase domain (amino acids 83 to 392 of SEQ ID NO:42) of human 47508 with a consensus amino acid sequence (SEQ ID NO:44) derived from a hidden Markov model is depicted in FIG. 22.
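By way of illustration only, the length and bit score criteria of this definition can be expressed as a simple check. The function names below are hypothetical. The sketch applies the criteria to the domain of human 47508 (amino acids 83 to 392 of SEQ ID NO:42), using the bit score of 84.6 reported herein for the PFAM alignment of FIG. 22, and treats the approximate length bounds as exact for the purpose of the example.

```python
# Minimal sketch (hypothetical helper names): apply the length and bit score
# criteria of the "histone deacetylase domain" definition to a candidate domain.

def domain_length(start: int, end: int) -> int:
    """Number of residues in a domain given 1-based, inclusive endpoints."""
    return end - start + 1


def meets_criteria(length_aa: int, bit_score: float,
                   min_len: int, max_len: int, min_score: float) -> bool:
    """True if the domain length lies within bounds and the bit score meets the minimum."""
    return min_len <= length_aa <= max_len and bit_score >= min_score


length_47508 = domain_length(83, 392)   # residues 83 to 392 of SEQ ID NO:42
score_47508 = 84.6                      # bit score reported for the FIG. 22 alignment

broad = meets_criteria(length_47508, score_47508, 250, 500, 50)    # broadest definition
narrow = meets_criteria(length_47508, score_47508, 305, 400, 84)   # narrowest preferred ranges
print(length_47508, broad, narrow)      # prints: 310 True True
```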

[2698] In a preferred embodiment, a 47508 polypeptide or protein has a “histone deacetylase domain” or a region which includes at least about 250 to 500, more preferably about 275 to 450, or 305 to 400 amino acid residues and has at least about 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “histone deacetylase domain,” e.g., the histone deacetylase domain of human 47508 (e.g., residues 83 to 392 of SEQ ID NO:42).

[2699] To identify the presence of a “histone deacetylase” domain in a 47508 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against the Pfam database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family-specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference. A search was performed against the HMM database resulting in the identification of a “histone deacetylase” domain in the amino acid sequence of human 47508 at about residues 83 to 392 of SEQ ID NO:42 (see FIG. 22).

[2700] Alternatively, to identify the presence of a “histone deacetylase” domain in a 47508 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against a database of domains, e.g., the ProDom database (Corpet et al. (1999), Nucl. Acids Res. 27:263-267). The ProDom protein domain database consists of an automatic compilation of homologous domains. Current versions of ProDom are built using recursive PSI-BLAST searches (Altschul S F et al. (1997) Nucleic Acids Res. 25:3389-3402; Gouzy et al. (1999) Computers and Chemistry 23:333-340) of the SWISS-PROT 38 and TREMBL protein databases. The database automatically generates a consensus sequence for each domain. A BLAST search was performed against the ProDom database resulting in the identification of three portions of a “histone deacetylase” domain in the amino acid sequence of human 47508, located at about residues 71 to 115, 120 to 258, and 251 to 371 of SEQ ID NO:42 (see FIGS. 23A-23C). Taken together, the three portions of a histone deacetylase domain which were identified in human 47508, by blasting the 47508 sequence against the ProDom database, constitute a domain that has similar length and endpoints (about amino acid residues 71 to 372 of SEQ ID NO:42) as the histone deacetylase domain identified in human 47508 by searching the PFAM database (about amino acid residues 83 to 392 of SEQ ID NO:42). Furthermore, functionally important amino acid residues (described below) that display conservation in the PFAM alignment (see FIG. 22) display similar conservation in the ProDom alignments (see FIGS. 23A-23C), and the bit scores for two of the three ProDom alignments (FIGS. 23A and 23C, with bit scores of 85.3 and 143.8, respectively) actually exceed the bit score for the PFAM alignment (FIG. 22, with a bit score of 84.6).

[2701] In one embodiment, a 47508 protein includes at least one histone deacetylase zinc-binding triad. As used herein, a “histone deacetylase zinc-binding triad” includes two conserved aspartic acid residues and a conserved histidine residue located at the bottom of the substrate-binding pocket of a histone deacetylase domain. A “histone deacetylase zinc-binding triad”, as defined, can be involved in the coordination of an ion, e.g., a zinc or cobalt ion, and through said ion, in the interaction with an acetylated lysine substrate. Histone deacetylase zinc-binding triads have been described in Finnin et al. (1999), supra, the contents of which are incorporated herein by reference.

[2702] In a preferred embodiment, a 47508 polypeptide or protein has at least one histone deacetylase zinc-binding triad, or a sequence in which not more than one of the two aspartic acid residues is conservatively substituted, e.g., substituted with asparagine, glutamic acid, or glutamine, while the other aspartic acid residue and the histidine residue are absolutely conserved. The residues of human 47508 that constitute the histone deacetylase zinc-binding triad are the aspartic acid residues located at about amino acid residues 247 and 327 of SEQ ID NO:42 and the histidine residue located at about amino acid residue 249 of SEQ ID NO:42.

[2703] In another embodiment, a 47508 protein includes at least two charge-relay systems. As used herein, a “charge-relay system” includes a conserved histidine residue and either an aspartic acid residue or an asparagine residue, which interact with one another such that the imidazole Nε atom of the histidine residue has an increased basicity, and the two residues are located at the bottom of the substrate binding pocket of a histone deacetylase domain. A “charge-relay system”, as defined, can be involved in the enzymatic deacetylation of an acetylated lysine residue. Charge-relay systems, as found in histone deacetylases, have been described in Finnin et al. (1999), supra, the contents of which are incorporated herein by reference.

[2704] In a preferred embodiment, a 47508 polypeptide or protein includes at least two charge-relay systems, or at least two conserved histidine residues that each interact with a second amino acid residue selected from the group of aspartic acid, asparagine, glutamine, and glutamic acid, whereby the imidazole Nε atom of the histidine residue has an increased basicity. The residues of human 47508 that constitute the two charge-relay systems are: 1) the histidine residue located at about amino acid residue 208 of SEQ ID NO:42 and the aspartic acid residue located at about amino acid residue 247 of SEQ ID NO:42; and 2) the histidine residue located at about amino acid residue 209 of SEQ ID NO:42 and the asparagine residue located at about amino acid residue 252 of SEQ ID NO:42.

[2705] A 47508 family member can include at least one histone deacetylase domain. Furthermore, a 47508 family member can include at least one histone deacetylase zinc-binding triad; at least two charge-relay systems; at least one, two, three, preferably four predicted protein kinase C phosphorylation sites (PS00005); at least one, two, three, four, and preferably five predicted casein kinase II phosphorylation sites (PS00006); at least one, two, three, four, five, six, and preferably seven predicted N-myristylation sites (PS00008); and at least one predicted amidation site (PS00009).

[2706] As the 47508 polypeptides of the invention may modulate 47508-mediated activities, they may be useful for developing novel diagnostic and therapeutic agents for 47508-mediated or related disorders, as described below.

[2707] As used herein, a “47508 activity”, “biological activity of 47508” or “functional activity of 47508”, refers to an activity exerted by a 47508 protein, polypeptide or nucleic acid molecule. For example, a 47508 activity can be an activity exerted by 47508 in a physiological milieu on, e.g., a 47508-responsive cell or on a 47508 substrate, e.g., a protein substrate. A 47508 activity can be determined in vivo or in vitro. In one embodiment, a 47508 activity is a direct activity, such as an association with a 47508 target molecule. A “target molecule” or “binding partner” is a molecule with which a 47508 protein binds or interacts in nature. In an exemplary embodiment, 47508 is an enzyme that catalyzes the removal of an acetyl group from an acetylated lysine residue of a substrate protein, e.g., a histone protein, e.g., H2A, H2B, H3, or H4.

[2708] A 47508 activity can also be an indirect activity, e.g., a cellular signaling activity mediated by interaction of the 47508 protein with a 47508 receptor. The features of the 47508 molecules of the present invention can provide similar biological activities as histone deacetylase family members. For example, the 47508 proteins of the present invention can have one or more of the following activities: (1) catalytic removal of acetyl groups from proteins, e.g., histones, e.g., histone H2A, H2B, H3, or H4; (2) catalytic removal of acetyl groups from lysine residues present in proteins, e.g., histones, e.g., histone H2A, H2B, H3, or H4; (3) regulation of the association of histones with DNA; (4) regulation of chromosomal condensation; (5) interaction with transcription factors, e.g., transcriptional repressors; (6) regulation of transcription, e.g., transcriptional repression; (7) regulation of cellular differentiation, e.g., suppression or induction of cellular differentiation; (8) regulation of the cell cycle, e.g., cell-cycle progression or arrest; (9) regulation of tumor growth; (10) localization to the nucleus; and (11) susceptibility to inhibition by histone deacetylase inhibitors, e.g., sodium butyrate, trichostatin, suberoylanilide hydroxamic acid, MS-27-275, or FR901228.

[2709] Thus, the 47508 molecules can act as novel diagnostic targets and therapeutic agents for controlling cellular proliferative and/or differentiative disorders, as well as disorders of the breast, ovary, lung, colon, or liver, and cardiovascular disorders.

[2710] The 47508 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of cellular proliferative and/or differentiative disorders, disorders associated with bone metabolism, immune disorders (e.g., inflammatory disorders), cardiovascular disorders, liver disorders, viral diseases, pain or metabolic disorders.

[2711] Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast and liver origin.

[2712] As used herein, the terms “cancer”, “hyperproliferative” and “neoplastic” refer to cells having the capacity for autonomous growth. Examples of such cells include cells having an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative and neoplastic disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. “Pathologic hyperproliferative” cells occur in disease states characterized by malignant tumor growth. Examples of non-pathologic hyperproliferative cells include proliferation of cells associated with wound repair.

[2713] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[2714] The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

[2715] The term “sarcoma” is art recognized and refers to malignant tumors of mesenchymal derivation.

[2716] Examples of cellular proliferative and/or differentiative disorders of the colon include, but are not limited to, non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors.

[2717] Examples of cellular proliferative and/or differentiative disorders of the liver include, but are not limited to, nodular hyperplasias, adenomas, and malignant tumors, including primary carcinoma of the liver and metastatic tumors.

[2718] Examples of cellular proliferative and/or differentiative disorders of the breast include, but are not limited to, proliferative breast disease including, e.g., epithelial hyperplasia, sclerosing adenosis, and small duct papillomas; tumors, e.g., stromal tumors such as fibroadenoma, phyllodes tumor, and sarcomas, and epithelial tumors such as large duct papilloma; carcinoma of the breast including in situ (noninvasive) carcinoma that includes ductal carcinoma in situ (including Paget's disease) and lobular carcinoma in situ, and invasive (infiltrating) carcinoma including, but not limited to, invasive ductal carcinoma, invasive lobular carcinoma, medullary carcinoma, colloid (mucinous) carcinoma, tubular carcinoma, and invasive papillary carcinoma, and miscellaneous malignant neoplasms. Disorders in the male breast include, but are not limited to, gynecomastia and carcinoma.

[2719] Examples of cellular proliferative and/or differentiative disorders of the lung include, but are not limited to, bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

[2720] Additional examples of proliferative disorders include hematopoietic neoplastic disorders. As used herein, the term “hematopoietic neoplastic disorders” includes diseases involving hyperplastic/neoplastic cells of hematopoietic origin. A hematopoietic neoplastic disorder can arise from myeloid, lymphoid or erythroid lineages, or precursor cells thereof. Preferably, the diseases arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit. Rev. in Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to, acute lymphoblastic leukemia (ALL) which includes B-lineage ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell leukemia (HCL) and Waldenstrom's macroglobulinemia (WM). Additional forms of malignant lymphomas include, but are not limited to, non-Hodgkin lymphoma and variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGL), Hodgkin's disease and Reed-Sternberg disease.

[2721] Disorders of the breast include, but are not limited to, disorders of development; inflammations, including but not limited to, acute mastitis, periductal mastitis (recurrent subareolar abscess, squamous metaplasia of lactiferous ducts), mammary duct ectasia, fat necrosis, granulomatous mastitis, and pathologies associated with silicone breast implants; fibrocystic changes; proliferative breast disease including, but not limited to, epithelial hyperplasia, sclerosing adenosis, and small duct papillomas; tumors including, but not limited to, stromal tumors such as fibroadenoma, phyllodes tumor, and sarcomas, and epithelial tumors such as large duct papilloma; carcinoma of the breast including in situ (noninvasive) carcinoma that includes ductal carcinoma in situ (including Paget's disease) and lobular carcinoma in situ, and invasive (infiltrating) carcinoma including, but not limited to, invasive ductal carcinoma, no special type, invasive lobular carcinoma, medullary carcinoma, colloid (mucinous) carcinoma, tubular carcinoma, and invasive papillary carcinoma, and miscellaneous malignant neoplasms. Disorders in the male breast include, but are not limited to, gynecomastia and carcinoma.

[2722] Disorders involving the ovary include, for example, polycystic ovarian disease, Stein-Leventhal syndrome, pseudomyxoma peritonei and stromal hyperthecosis; ovarian tumors such as tumors of coelomic epithelium, serous tumors, mucinous tumors, endometrioid tumors, clear cell adenocarcinoma, cystadenofibroma, Brenner tumor, surface epithelial tumors; germ cell tumors such as mature (benign) teratomas, monodermal teratomas, immature malignant teratomas, dysgerminoma, endodermal sinus tumor, choriocarcinoma; sex cord-stromal tumors such as granulosa-theca cell tumors, thecoma-fibromas, androblastomas, hilus cell tumors, and gonadoblastoma; and metastatic tumors such as Krukenberg tumors.

[2723] Examples of disorders of the lung include, but are not limited to, congenital anomalies; atelectasis; diseases of vascular origin, such as pulmonary congestion and edema, including hemodynamic pulmonary edema and edema caused by microvascular injury, adult respiratory distress syndrome (diffuse alveolar damage), pulmonary embolism, hemorrhage, and infarction, and pulmonary hypertension and vascular sclerosis; chronic obstructive pulmonary disease, such as emphysema, chronic bronchitis, bronchial asthma, and bronchiectasis; diffuse interstitial (infiltrative, restrictive) diseases, such as pneumoconioses, sarcoidosis, idiopathic pulmonary fibrosis, desquamative interstitial pneumonitis, hypersensitivity pneumonitis, pulmonary eosinophilia (pulmonary infiltration with eosinophilia), Bronchiolitis obliterans-organizing pneumonia, diffuse pulmonary hemorrhage syndromes, including Goodpasture syndrome, idiopathic pulmonary hemosiderosis and other hemorrhagic syndromes, pulmonary involvement in collagen vascular disorders, and pulmonary alveolar proteinosis; complications of therapies, such as drug-induced lung disease, radiation-induced lung disease, and lung transplantation; tumors, such as bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

[2724] Disorders involving the colon include, but are not limited to, congenital anomalies, such as atresia and stenosis, Meckel diverticulum, congenital aganglionic megacolon-Hirschsprung disease; enterocolitis, such as diarrhea and dysentery, infectious enterocolitis, including viral gastroenteritis, bacterial enterocolitis, necrotizing enterocolitis, antibiotic-associated colitis (pseudomembranous colitis), and collagenous and lymphocytic colitis, miscellaneous intestinal inflammatory disorders, including parasites and protozoa, acquired immunodeficiency syndrome, transplantation, drug-induced intestinal injury, radiation enterocolitis, neutropenic colitis (typhlitis), and diversion colitis; idiopathic inflammatory bowel disease, such as Crohn disease and ulcerative colitis; tumors of the colon, such as non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors.

[2725] The 47508 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO:42 are collectively referred to as “polypeptides or proteins of the invention” or “47508 polypeptides or proteins”. Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “47508 nucleic acids.” 47508 molecules refer to 47508 nucleic acids, polypeptides, and antibodies.

[2726] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA), RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA. A DNA or RNA analog can be synthesized from nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[2727] The term “isolated nucleic acid molecule” or “purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules present in the natural source of the nucleic acid. For example, with regard to genomic DNA, the term “isolated” includes nucleic acid molecules which are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[2728] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and nonaqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified.
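For convenience, the four sets of conditions recited above can be tabulated as a small lookup structure. The following Python sketch is illustrative only: the names `StringencyCondition`, `CONDITIONS`, and `get_condition` are hypothetical, and the values are simply transcribed from the preceding paragraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StringencyCondition:
    hybridization: str   # hybridization buffer and temperature
    washes: str          # number of washes recited
    wash_buffer: str     # wash buffer composition
    wash_temp_c: int     # wash temperature in degrees Celsius


# Values transcribed from the paragraph above; names are illustrative.
CONDITIONS = {
    # Low stringency: washes at least at 50 C (may be raised to 55 C).
    "low": StringencyCondition(
        "6x SSC at about 45 C", "two", "0.2x SSC, 0.1% SDS", 50),
    "medium": StringencyCondition(
        "6x SSC at about 45 C", "one or more", "0.2x SSC, 0.1% SDS", 60),
    "high": StringencyCondition(
        "6x SSC at about 45 C", "one or more", "0.2x SSC, 0.1% SDS", 65),
    "very high": StringencyCondition(
        "0.5 M sodium phosphate, 7% SDS at 65 C",
        "one or more", "0.2x SSC, 1% SDS", 65),
}

# Very high stringency is the stated default "unless otherwise specified".
DEFAULT_CONDITION = "very high"


def get_condition(name=None):
    """Return the named condition, or the default when no name is given."""
    return CONDITIONS[name or DEFAULT_CONDITION]
```

Calling `get_condition()` with no argument returns the very high stringency entry, mirroring the statement that those conditions apply unless otherwise specified.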

[2729] Preferably, an isolated nucleic acid molecule of the invention that hybridizes under a stringency condition described herein to the sequence of SEQ ID NO:41 or SEQ ID NO:43, corresponds to a naturally-occurring nucleic acid molecule.

[2730] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature. For example a naturally occurring nucleic acid molecule can encode a natural protein.

[2731] As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules which include at least an open reading frame encoding a 47508 protein. The gene can optionally further include non-coding sequences, e.g., regulatory sequences and introns. Preferably, a gene encodes a mammalian 47508 protein or derivative thereof.

[2732] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. “Substantially free” means that a preparation of 47508 protein is at least 10% pure. In a preferred embodiment, the preparation of 47508 protein has less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-47508 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-47508 chemicals. When the 47508 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[2733] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 47508 without abolishing or substantially altering a 47508 activity. Preferably the alteration does not substantially alter the 47508 activity, e.g., the activity is at least 20%, 40%, 60%, 70% or 80% of wild-type. An “essential” amino acid residue is a residue that, when altered from the wild-type sequence of 47508, results in abolishing a 47508 activity such that less than 20% of the wild-type activity is present. For example, conserved amino acid residues in 47508 are predicted to be particularly unamenable to alteration.

[2734] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a 47508 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a 47508 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 47508 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO:41 or SEQ ID NO:43, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
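The side chain families listed above can be expressed as sets of one-letter amino acid codes, so that a proposed substitution can be checked for membership in a shared family. The following Python sketch is illustrative only; the names `SIDE_CHAIN_FAMILIES`, `shared_families`, and `is_conservative` are hypothetical. Because some residues appear in more than one family (e.g., threonine, tyrosine, histidine), the sketch treats a substitution as conservative when the two residues share at least one family.

```python
# Families as recited in the paragraph above, in one-letter codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),                 # lysine, arginine, histidine
    "acidic": set("DE"),                 # aspartic acid, glutamic acid
    "uncharged polar": set("GNQSTYC"),   # Gly, Asn, Gln, Ser, Thr, Tyr, Cys
    "nonpolar": set("AVLIPFMW"),         # Ala, Val, Leu, Ile, Pro, Phe, Met, Trp
    "beta-branched": set("TVI"),         # threonine, valine, isoleucine
    "aromatic": set("YFWH"),             # Tyr, Phe, Trp, His
}


def shared_families(a, b):
    """Return the sorted names of families that contain both residues."""
    return sorted(
        name for name, members in SIDE_CHAIN_FAMILIES.items()
        if a in members and b in members
    )


def is_conservative(a, b):
    """True if a -> b replaces a residue with a different residue
    from at least one common side chain family."""
    return a != b and bool(shared_families(a, b))
```

For example, `is_conservative("K", "R")` is true (both basic), whereas `is_conservative("K", "D")` is false (basic versus acidic).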

[2735] As used herein, a “biologically active portion” of a 47508 protein includes a fragment of a 47508 protein which participates in an interaction, e.g., an intramolecular or an inter-molecular interaction. An inter-molecular interaction can be a specific binding interaction or an enzymatic interaction (e.g., the interaction can be transient and a covalent bond is formed or broken). An inter-molecular interaction can be between a 47508 molecule and a non-47508 molecule or between a first 47508 molecule and a second 47508 molecule (e.g., a dimerization interaction). Biologically active portions of a 47508 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 47508 protein, e.g., the amino acid sequence shown in SEQ ID NO:42, which include fewer amino acids than the full-length 47508 proteins, and exhibit at least one activity of a 47508 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 47508 protein, e.g., histone deacetylation, e.g., deacetylation of histone H2A, H2B, H3, or H4. A biologically active portion of a 47508 protein can be a polypeptide which is, for example, 10, 25, 50, 100, 200 or more amino acids in length. Biologically active portions of a 47508 protein can be used as targets for developing agents which modulate a 47508 mediated activity, e.g., histone deacetylation, e.g., deacetylation of histone H2A, H2B, H3, or H4.

[2736] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[2737] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, 60%, and even more preferably at least 70%, 80%, 90%, 100% of the length of the reference sequence. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”).

[2738] The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
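As a worked illustration of the calculation described above, the following Python sketch computes percent identity from two sequences that have already been aligned. It assumes one common convention, namely identical positions divided by the number of alignment columns (so that each gap position lowers the result); other denominators are also used in practice. The function name `percent_identity` and the use of `-` as the gap character are assumptions made for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pairwise alignment.

    Assumed convention: identical columns / total columns, where
    columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y) for x, y in zip(aligned_a, aligned_b)
        if not (x == gap and y == gap)
    ]
    if not columns:
        return 0.0
    identical = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * identical / len(columns)
```

For the alignment `ACGT-A` / `ACCTTA`, four of six columns are identical, so the function returns approximately 66.7.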

[2739] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch ((1970) J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used unless otherwise specified) is a BLOSUM62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
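The global alignment approach of Needleman and Wunsch can be sketched in a few lines of Python. The sketch below is not the GAP program and does not use a BLOSUM62 or PAM250 matrix or separate gap and length weights; it assumes a simplified scoring scheme (a fixed match score, a fixed mismatch score, and a single linear gap penalty) chosen only to show the dynamic programming recurrence and traceback. The function name and default scores are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment with a simplified linear gap penalty.

    Returns (aligned_a, aligned_b, score).
    """
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for a[i-1] against b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]
```

With the default scores, aligning `ACGT` against `AGT` yields `ACGT` over `A-GT` with a score of 1 (three matches and one gap position). A production comparison would substitute a residue scoring matrix and the gap parameters recited above.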

[2740] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Meyers and W. Miller ((1989) CABIOS, 4:11-17) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[2741] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 47508 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 47508 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25:3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[2742] Particularly preferred 47508 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO:42. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:42 are termed substantially identical.

[2743] In the context of a nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:41 or 43 are termed substantially identical.

[2744] “Misexpression or aberrant expression”, as used herein, refers to a non-wild type pattern of gene expression at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over- or under-expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of altered, e.g., increased or decreased, expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, translated amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[2745] “Subject,” as used herein, refers to human and non-human animals. The term “non-human animals” of the invention includes all vertebrates, e.g., mammals, such as non-human primates (particularly higher primates), sheep, dog, rodent (e.g., mouse or rat), guinea pig, goat, pig, cat, rabbit, cow, and non-mammals, such as chickens, amphibians, reptiles, etc. In a preferred embodiment, the subject is a human. In another embodiment, the subject is an experimental animal or animal suitable as a disease model.

[2746] A “purified preparation of cells”, as used herein, refers to an in vitro preparation of cells. In the case of cells from multicellular organisms (e.g., plants and animals), a purified preparation of cells is a subset of cells obtained from the organism, not the entire intact organism. In the case of unicellular microorganisms (e.g., cultured cells and microbial cells), it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[2747] Various aspects of the invention are described in further detail below.

[2748] Isolated Nucleic Acid Molecules of 47508

[2749] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes a 47508 polypeptide described herein, e.g., a full-length 47508 protein or a fragment thereof, e.g., a biologically active portion of 47508 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 47508 mRNA, and fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[2750] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO:41, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 47508 protein (i.e., “the coding region” of SEQ ID NO:41, as shown in SEQ ID NO:43), as well as 5′ untranslated sequences. Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:41 (e.g., SEQ ID NO:43) and, e.g., no flanking sequences which normally accompany the subject sequence. In yet another embodiment, the nucleic acid includes one or more nucleotides from 1-234 or 852-862 of SEQ ID NO:41. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to a fragment of the protein from about amino acid 8 to 392 of SEQ ID NO:42. In other embodiments, the nucleic acid molecule includes a nucleotide sequence encoding one or more of amino acids 1-57 or 263-266 of SEQ ID NO:42.

[2751] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule which is a complement of the nucleotide sequence shown in SEQ ID NO:41 or SEQ ID NO:43, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO:41 or SEQ ID NO:43, such that it can hybridize (e.g., under a stringency condition described herein) to the nucleotide sequence shown in SEQ ID NO:41 or 43, thereby forming a stable duplex.

[2752] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence which is at least about: 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO:41 or SEQ ID NO:43, or a portion, preferably of the same length, of any of these nucleotide sequences.

[2753] 47508 Nucleic Acid Fragments

[2754] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO:41 or 43. For example, such a nucleic acid molecule can include a fragment which can be used as a probe or primer or a fragment encoding a portion of a 47508 protein, e.g., an immunogenic or biologically active portion of a 47508 protein. A fragment can comprise those nucleotides of SEQ ID NO:41 (e.g., one or more of nucleotides 1-234 of SEQ ID NO:41), which encode the N-terminus of 47508, e.g., about amino acid residues 1 to 57 of SEQ ID NO:42, or a portion thereof. Alternatively, a fragment can comprise those nucleotides of SEQ ID NO:41 which encode amino acids 263 to 266 of SEQ ID NO:42 (e.g., nucleotides 852-862 of SEQ ID NO:41). The nucleotide sequence determined from the cloning of the 47508 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 47508 family members, or fragments thereof, as well as 47508 homologues, or fragments thereof, from other species.
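The correspondence between nucleotide ranges of SEQ ID NO:41 and amino acid residues of SEQ ID NO:42 recited above can be illustrated with simple codon arithmetic. The following Python sketch assumes that the coding region begins at nucleotide 66 of SEQ ID NO:41; this start position is an inference from the 5′ UTR being described herein as about nucleotides 1 to 65 and is not stated explicitly. The names `CDS_START`, `codon_span`, and `residues_touched` are hypothetical.

```python
# Assumption: first nucleotide of the start codon (1-based), inferred
# from the 5' UTR being described as about nucleotides 1 to 65.
CDS_START = 66


def codon_span(residue, cds_start=CDS_START):
    """Return the 1-based (first, last) nucleotides of the codon
    encoding the given 1-based amino acid residue."""
    first = cds_start + 3 * (residue - 1)
    return first, first + 2


def residues_touched(nt_first, nt_last, cds_start=CDS_START):
    """Return the (first, last) residues whose codons overlap the
    1-based nucleotide range, or None if the range is entirely 5'
    of the coding region."""
    if nt_last < cds_start:
        return None
    first = max(nt_first, cds_start)
    return (
        (first - cds_start) // 3 + 1,
        (nt_last - cds_start) // 3 + 1,
    )
```

Under this assumption, `residues_touched(852, 862)` gives residues 263 to 266 and `residues_touched(1, 234)` gives residues 1 to 57, in agreement with the ranges recited above. Note that a range can end partway through a codon, as nucleotide 862 does here.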

[2755] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment which includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 100, 200, 220, 240, 250, 275, 300, 320, or more amino acids in length. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention, e.g., SEQ ID NO: 4375 of WO 00/58473, SEQ ID NOS: 2079 and 2462 of WO 01/02568, or sequences having NCBI accession numbers AL137362 or AU079696.

[2756] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domains, regions, or functional sites described herein. Thus, for example, a 47508 nucleic acid fragment can include a sequence corresponding to a histone deacetylase domain, e.g., about amino acid residues 83 to 392 of SEQ ID NO:42 (or a fragment thereof, e.g., amino acids 83-150, 150-200, 200-250, 250-300, 300-350, or 350-392 of SEQ ID NO:42), and at least one amino acid residue from amino acid residues 1 to 57 of SEQ ID NO:42.

[2757] 47508 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under a stringency condition described herein to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO:41 or SEQ ID NO:43, or of a naturally occurring allelic variant or mutant of SEQ ID NO:41 or SEQ ID NO:43.

[2758] In a preferred embodiment, the nucleic acid is a probe which is at least 5 or 10, and less than 200, more preferably less than 100, or less than 50, base pairs in length. It should be identical to, or differ by 1, or by fewer than 5 or 10, bases from a sequence disclosed herein. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[2759] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid which encodes: the 5′ UTR of 47508, e.g., about nucleotides 1 to 65 of SEQ ID NO:41; N-terminal portions of human 47508, e.g., about amino acid residues 1 to 57 of SEQ ID NO:42; the histone deacetylase domain of human 47508, e.g., about amino acid residues 83 to 392; catalytically important motifs of the histone deacetylase domain of human 47508, e.g., about amino acid residues 205 to 215, or 240 to 255, or 325 to 335 of SEQ ID NO:42; or other fragments of the histone deacetylase domain of human 47508, e.g., about amino acid residues 260 to 270 of SEQ ID NO:42.

[2760] In another embodiment, a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 47508 sequence, e.g., a domain, region, site or other sequence described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical to, or differ by one base from, a sequence disclosed herein or a naturally occurring variant. For example, primers suitable for amplifying all or a portion of any of the following regions are provided: a 5′ UTR of human 47508, e.g., about nucleotides 1 to 65 of SEQ ID NO:41; a region which encodes an N-terminal portion of human 47508, e.g., about amino acids 1 to 57 of SEQ ID NO:42; a region which encodes a histone deacetylase domain and at least one amino acid from about amino acid residues 1 to 57 of SEQ ID NO:42, e.g., from about amino acid 57 to 392 of SEQ ID NO:42; a region which encodes a catalytically important fragment of the histone deacetylase domain of human 47508, e.g., about amino acid residues 205 to 215, 240 to 255, or 325 to 335; a region which encodes other portions of the histone deacetylase domain of human 47508, e.g., about amino acid residues 260 to 270 of SEQ ID NO:42.

[2761] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[2762] A nucleic acid fragment encoding a “biologically active portion of a 47508 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:41 or 43, which encodes a polypeptide having a 47508 biological activity (e.g., the biological activities of the 47508 proteins are described herein), expressing the encoded portion of the 47508 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 47508 protein. For example, a nucleic acid fragment encoding a biologically active portion of 47508 includes a histone deacetylase domain and at least one amino acid from about amino acid residues 1 to 57 of SEQ ID NO:42, e.g., from about amino acid 57 to 392 of SEQ ID NO:42. A nucleic acid fragment encoding a biologically active portion of a 47508 polypeptide may comprise a nucleotide sequence which is greater than 300, 350, 400, 450, 500, 550, 600, 650, or more nucleotides in length.

[2763] In preferred embodiments, a nucleic acid includes a nucleotide sequence which is about 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1550, or more nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO:41, or SEQ ID NO:43. In a preferred embodiment, a nucleic acid of human 47508 includes at least one contiguous nucleotide from the region of about nucleotides 1-65, 1-234, 65-234, 65-311, 234-500, 312-700, 312-1307, 600-850, 750-850, 750-1000, 852-862, 900-1182, 1000-1100, 1150-1307, 1183-1307, 1183-1579, 1308-1579.

[2764] 47508 Nucleic Acid Variants

[2765] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:41 or SEQ ID NO:43. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid which encodes the same 47508 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence which differs by at least 1, but less than 5, 10, 20, 50, or 100, amino acid residues from that shown in SEQ ID NO:42. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[2766] Nucleic acids of the invention can be chosen for having codons which are preferred, or non-preferred, for a particular expression system. E.g., the nucleic acid can be one in which at least one codon, and preferably at least 10%, or 20% of the codons, has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells.
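To illustrate how the proportion of altered codons might be assessed, the following Python sketch compares an original coding sequence with a codon-altered variant, reports the fraction of codons that differ, and translates both with the standard genetic code so that the encoded protein can be confirmed unchanged. The function names and the toy sequences are hypothetical, and no particular codon-preference table for any expression system is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds):
    """Translate an uppercase DNA coding sequence; '*' marks stop codons.
    Trailing bases that do not form a full codon are ignored."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def fraction_codons_altered(original, variant):
    """Fraction of codon positions at which the two sequences differ."""
    if not original or len(original) != len(variant) or len(original) % 3:
        raise ValueError("sequences must be non-empty, equal-length, in-frame")
    total = len(original) // 3
    changed = sum(
        original[i:i + 3] != variant[i:i + 3]
        for i in range(0, len(original), 3)
    )
    return changed / total
```

For the toy sequences `ATGAAAGGC` and `ATGAAGGGT`, two of three codons differ while both translate to `MKG`, i.e., the alteration is silent at the protein level.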

[2767] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism), or can be non-naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared in the encoded product).

[2768] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO:41 or 43, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 1%, 5%, 10% or 20% of the nucleotides in the subject nucleic acid. If necessary for this analysis the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[2769] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is at least about 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90%, 95%, 98%, 99%, or more identical to the amino acid sequence shown in SEQ ID NO:42 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under a stringency condition described herein, to a nucleotide sequence encoding the sequence shown in SEQ ID NO:42 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 47508 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 47508 gene.

[2770] Preferred variants include those that are correlated with histone deacetylase activity, e.g., deacetylation of histones H2A, H2B, H3, or H4.

[2771] Allelic variants of 47508, e.g., human 47508, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 47508 protein within a population that maintain the ability to deacetylate histones, e.g., histones H2A, H2B, H3, or H4, or to interact with transcription factors, e.g., transcriptional repressors. Functional allelic variants will typically contain only conservative substitution of one or more amino acids of SEQ ID NO:42, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally-occurring amino acid sequence variants of the 47508, e.g., human 47508, protein within a population that do not have the ability to deacetylate histones, e.g., histones H2A, H2B, H3, or H4, or to interact with transcription factors, e.g., transcriptional repressors. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO:42, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[2772] Moreover, nucleic acid molecules encoding other 47508 family members and, thus, which have a nucleotide sequence which differs from the 47508 sequences of SEQ ID NO:41 or SEQ ID NO:43 are intended to be within the scope of the invention.

[2773] Antisense Nucleic Acid Molecules, Ribozymes and Modified 47508 Nucleic Acid Molecules

[2774] In another aspect, the invention features an isolated nucleic acid molecule which is antisense to 47508. An “antisense” nucleic acid can include a nucleotide sequence which is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 47508 coding strand, or to only a portion thereof (e.g., the coding region of human 47508 corresponding to SEQ ID NO:43). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 47508 (e.g., the 5′ and 3′ untranslated regions).

[2775] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 47508 mRNA, but more preferably is an oligonucleotide which is antisense to only a portion of the coding or noncoding region of 47508 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 47508 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence of interest. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
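The design of an antisense oligonucleotide complementary to the region surrounding the translation start site can be illustrated as follows. The Python sketch below takes a sense-strand sequence written as DNA, extracts the window from position −10 to +10 relative to the first nucleotide of the start codon (with no position zero, so the window spans 20 nucleotides), and returns its reverse complement. The function names, the 0-based `start_index` parameter, and the toy sequence in the usage note are hypothetical and are not taken from SEQ ID NO:41.

```python
# Complement mapping for DNA letters (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_to_start_site(sense, start_index, upstream=10, downstream=10):
    """Antisense oligonucleotide spanning the translation start site.

    sense        sense-strand sequence written as DNA, 5' to 3'
    start_index  0-based index of the A of the ATG start codon (+1)
    upstream     nucleotides 5' of the start codon to include (-10..-1)
    downstream   nucleotides from +1 onward to include (+1..+10)

    The window is truncated if it would run past either end.
    """
    lo = max(0, start_index - upstream)
    hi = min(len(sense), start_index + downstream)
    return reverse_complement(sense[lo:hi])
```

For a toy sense strand `GCGCGCTTAACCATGGCTAGCAAAGGG` with the start codon at index 12, the function returns a 20-nucleotide oligonucleotide, written 5′ to 3′, that is complementary to nucleotides −10 to +10.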

[2776] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[2777] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a 47508 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies which bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

[2778] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[2779] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for a 47508-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of a 47508 cDNA disclosed herein (i.e., SEQ ID NO:41 or SEQ ID NO:43), and a known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haseloff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a 47508-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 47508 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science 261:1411-1418.

[2780] 47508 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 47508 gene (e.g., the 47508 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 47508 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6:569-84; Helene, C. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) Bioessays 14:807-15. The potential sequences that can be targeted for triple helix formation can be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.

[2781] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[2782] A 47508 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For non-limiting examples of synthetic oligonucleotides with modifications see Toulmé (2001) Nature Biotech. 19:17 and Faria et al. (2001) Nature Biotech. 19:40-44. Such phosphoramidite oligonucleotides can be effective antisense agents.

[2783] For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4: 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra and Perry-O'Keefe et al. Proc. Natl. Acad. Sci. 93: 14670-675.

[2784] PNAs of 47508 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 47508 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene, (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes, (e.g., S1 nucleases (Hyrup B. et al. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[2785] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[2786] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region which is complementary to a 47508 nucleic acid of the invention and two complementary regions, one having a fluorophore and one a quencher, such that the molecular beacon is useful for quantitating the presence of the 47508 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.

[2787] Isolated 47508 Polypeptides

[2788] In another aspect, the invention features an isolated 47508 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-47508 antibodies. 47508 protein can be isolated from cells or tissue sources using standard protein purification techniques. 47508 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[2789] Polypeptides of the invention include those which arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when expressed in a native cell.

[2790] In a preferred embodiment, a 47508 polypeptide has one or more of the following characteristics:

[2791] (i) it has the ability to deacetylate histones, e.g., histones H2A, H2B, H3, or H4;

[2792] (ii) it has a molecular weight, e.g., a deduced molecular weight, preferably ignoring any contribution of post translational modifications, amino acid composition or other physical characteristic of a 47508 polypeptide, e.g., a polypeptide of SEQ ID NO:42;

[2793] (iii) it has an overall sequence similarity of at least 60%, more preferably at least 70%, 80%, 90%, 95%, 98%, 99%, or more with a polypeptide of SEQ ID NO:42;

[2794] (iv) it can be found in the nucleus of a cell;

[2795] (v) it has a histone deacetylase domain which is preferably about 70%, 80%, 90%, 95%, 98%, 99%, or more homologous to amino acid residues about 83 to 392 of SEQ ID NO:42;

[2796] (vi) it has at least one histone deacetylase zinc-binding triad;

[2797] (vii) it has at least two charge-relay systems;

[2798] (viii) it can colocalize with transcription factors, e.g., transcriptional repressors;

[2799] (ix) it has at least one, two, three, preferably four predicted Protein kinase C phosphorylation sites (PS00005);

[2800] (x) it has at least one, two, three, four, preferably five predicted Casein kinase II phosphorylation sites (PS00006);

[2801] (xi) it has at least one, two, three, four, five, six, preferably seven predicted N-myristoylation sites (PS00008); or

[2802] (xii) it has at least one predicted amidation site (PS00009).

[2803] In a preferred embodiment the 47508 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:42. In one embodiment it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another it differs from the corresponding sequence in SEQ ID NO:42 by at least one residue but less than 20%, 15%, 10% or 5% of the residues in it differ from the corresponding sequence in SEQ ID NO:42. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non essential residue or a conservative substitution. In a preferred embodiment the differences are not in the histone deacetylase domain, e.g., about amino acid residues 83 to 392 of SEQ ID NO:42. In another preferred embodiment one or more differences are in the histone deacetylase domain, e.g., about amino acids 83 to 392 of SEQ ID NO:42.
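
The counting rule stated above, under which looped-out positions and mismatches are both treated as differences, can be sketched as follows. This is a minimal illustration that assumes the two sequences have already been aligned for maximum homology by a separate tool and that gaps are written as `-`. The function names are hypothetical, and taking the percentage relative to the number of residues in the variant is one reading of "the residues in it", adopted here as an assumption.

```python
# Illustrative sketch only: function names and the gap symbol are assumptions.

def count_differences(aligned_variant, aligned_reference):
    """Count alignment columns that differ; a gap opposite a residue counts as a difference."""
    if len(aligned_variant) != len(aligned_reference):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(1 for a, b in zip(aligned_variant, aligned_reference) if a != b)


def percent_different(aligned_variant, aligned_reference, gap="-"):
    """Differences as a percentage of the residues present in the variant (assumed denominator)."""
    variant_residues = sum(1 for a in aligned_variant if a != gap)
    if variant_residues == 0:
        raise ValueError("variant contains no residues")
    return 100.0 * count_differences(aligned_variant, aligned_reference) / variant_residues


def within_limits(aligned_variant, aligned_reference, max_count=15, max_percent=20.0):
    """True if the variant differs by at least one residue but stays under both limits."""
    n = count_differences(aligned_variant, aligned_reference)
    pct = percent_different(aligned_variant, aligned_reference)
    return 1 <= n < max_count and pct < max_percent
```

The default limits of 15 residues and 20% correspond to the broadest thresholds recited above; the narrower thresholds (10 or 5 residues; 15%, 10% or 5%) can be passed as arguments.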

[2804] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 47508 proteins differ in amino acid sequence from SEQ ID NO:42, yet retain biological activity.

[2805] In one embodiment, the protein includes an amino acid sequence at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or more homologous to SEQ ID NO:42.

[2806] A 47508 protein or fragment is provided which varies from the sequence of SEQ ID NO:42 in regions defined by amino acids about 83 to 392 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment but which does not differ from SEQ ID NO:42 in regions defined by amino acids about 263 to 266. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.

[2807] In one embodiment, a biologically active portion of a 47508 protein includes a histone deacetylase domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 47508 protein.

[2808] In a preferred embodiment, the 47508 protein has an amino acid sequence shown in SEQ ID NO:42. In other embodiments, the 47508 protein is substantially identical to SEQ ID NO:42. In another embodiment, the 47508 protein is substantially identical to SEQ ID NO:42 and retains the functional activity of the protein of SEQ ID NO:42, as described in detail in the subsections above. In yet another embodiment, the 47508 protein or fragment thereof includes at least one amino acid from amino acid residues 1 to 57, or 263 to 266 of SEQ ID NO:42.

[2809] 47508 Chimeric or Fusion Proteins

[2810] In another aspect, the invention provides 47508 chimeric or fusion proteins. As used herein, a 47508 “chimeric protein” or “fusion protein” includes a 47508 polypeptide linked to a non-47508 polypeptide. A “non-47508 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the 47508 protein, e.g., a protein which is different from the 47508 protein and which is derived from the same or a different organism. The 47508 polypeptide of the fusion protein can correspond to all or a portion, e.g., a fragment described herein, of a 47508 amino acid sequence. In a preferred embodiment, a 47508 fusion protein includes at least one (or two) biologically active portion of a 47508 protein. The non-47508 polypeptide can be fused to the N-terminus or C-terminus of the 47508 polypeptide.

[2811] The fusion protein can include a moiety which has a high affinity for a ligand. For example, the fusion protein can be a GST-47508 fusion protein in which the 47508 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 47508. Alternatively, the fusion protein can be a 47508 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 47508 can be increased through use of a heterologous signal sequence.

[2812] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[2813] The 47508 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 47508 fusion proteins can be used to affect the bioavailability of a 47508 substrate. 47508 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example, (i) aberrant modification or mutation of a gene encoding a 47508 protein; (ii) mis-regulation of the 47508 gene; and (iii) aberrant post-translational modification of a 47508 protein.

[2814] Moreover, the 47508-fusion proteins of the invention can be used as immunogens to produce anti-47508 antibodies in a subject, to purify 47508 ligands and in screening assays to identify molecules which inhibit the interaction of 47508 with a 47508 substrate.

[2815] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 47508-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 47508 protein.

[2816] Variants of 47508 Proteins

[2817] In another aspect, the invention also features a variant of a 47508 polypeptide, e.g., which functions as an agonist (mimetics) or as an antagonist. Variants of the 47508 proteins can be generated by mutagenesis, e.g., discrete point mutation, the insertion or deletion of sequences or the truncation of a 47508 protein. An agonist of the 47508 proteins can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of a 47508 protein. An antagonist of a 47508 protein can inhibit one or more of the activities of the naturally occurring form of the 47508 protein by, for example, competitively modulating a 47508-mediated activity of a 47508 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 47508 protein.

[2818] Variants of a 47508 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a 47508 protein for agonist or antagonist activity.

[2819] Libraries of fragments, e.g., N-terminal, C-terminal, or internal fragments, of a 47508 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of a 47508 protein. Variants in which a cysteine residue is added or deleted or in which a residue which is glycosylated is added or deleted are particularly preferred.
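
As a purely computational illustration of what such a fragment library contains, the following sketch enumerates N-terminal, C-terminal, and internal fragments of a sequence at or above a chosen minimum length. It operates on a sequence string rather than describing any laboratory procedure; the function name, the `min_len` parameter, and the example peptide in the usage note are hypothetical and are not drawn from SEQ ID NO:42.

```python
# Illustrative sketch only: names and the example sequence are assumptions.

def fragment_library(sequence, min_len):
    """Enumerate proper fragments of `sequence`, grouped by position.

    N-terminal fragments keep the first residue, C-terminal fragments keep
    the last residue, and internal fragments keep neither. The full-length
    sequence itself is not included.
    """
    n = len(sequence)
    library = {"n_terminal": [], "c_terminal": [], "internal": []}
    for length in range(min_len, n):
        library["n_terminal"].append(sequence[:length])
        library["c_terminal"].append(sequence[n - length:])
        # Internal fragments start after residue 1 and end before the last residue.
        for start in range(1, n - length):
            library["internal"].append(sequence[start:start + length])
    return library
```

For a hypothetical five-residue peptide `"MKTAY"` and `min_len=3`, the library holds two N-terminal fragments, two C-terminal fragments, and one internal fragment. The same enumeration applies to a nucleotide coding sequence if fragment boundaries are restricted to codon positions, which this sketch does not enforce.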

[2820] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property are known in the art. Such methods are adaptable for rapid screening of the gene libraries generated by combinatorial mutagenesis of 47508 proteins. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify 47508 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6:327-331).

[2821] Cell-based assays can be exploited to analyze a variegated 47508 library. For example, a library of expression vectors can be transfected into a cell line which ordinarily responds to 47508 in a substrate-dependent manner. The transfected cells are then contacted with 47508 and the effect of the expression of the mutant on signaling by the 47508 substrate can be detected, e.g., by measuring histone deacetylase activity. Plasmid DNA can then be recovered from the cells which score for inhibition, or alternatively, potentiation of signaling by the 47508 substrate, and the individual clones further characterized.

[2822] In another aspect, the invention features a method of making a 47508 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 47508 polypeptide. The method includes: altering the sequence of a 47508 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[2823] In another aspect, the invention features a method of making a fragment or analog of a 47508 polypeptide having a biological activity of a naturally occurring 47508 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of a 47508 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[2824] Anti-47508 Antibodies

[2825] In another aspect, the invention provides an anti-47508 antibody, or a fragment thereof (e.g., an antigen-binding fragment thereof). The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. As used herein, the term “antibody” refers to a protein comprising at least one, and preferably two, heavy (H) chain variable regions (abbreviated herein as VH), and at least one and preferably two light (L) chain variable regions (abbreviated herein as VL). The VH and VL regions can be further subdivided into regions of hypervariability, termed “complementarity determining regions” (“CDR”), interspersed with regions that are more-conserved, termed “framework regions” (FR). The extent of the framework region and CDR's has been precisely defined (see, Kabat, E. A., et al. (1991) Sequences of Proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, and Chothia, C. et al. (1987) J. Mol. Biol. 196:901-917, which are incorporated herein by reference). Each VH and VL is composed of three CDR's and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.

[2826] The anti-47508 antibody can further include a heavy and light chain constant region, to thereby form a heavy and light immunoglobulin chain, respectively. In one embodiment, the antibody is a tetramer of two heavy immunoglobulin chains and two light immunoglobulin chains, wherein the heavy and light immunoglobulin chains are inter-connected by, e.g., disulfide bonds. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. The light chain constant region is comprised of one domain, CL. The variable region of the heavy and light chains contains a binding domain that interacts with an antigen. The constant regions of the antibodies typically mediate the binding of the antibody to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system.

[2827] As used herein, the term “immunoglobulin” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes. The recognized human immunoglobulin genes include the kappa, lambda, alpha (IgA1 and IgA2), gamma (IgG1, IgG2, IgG3, IgG4), delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Full-length immunoglobulin “light chains” (about 25 kDa or 214 amino acids) are encoded by a variable region gene at the NH2-terminus (about 110 amino acids) and a kappa or lambda constant region gene at the COOH-terminus. Full-length immunoglobulin “heavy chains” (about 50 kDa or 446 amino acids) are similarly encoded by a variable region gene (about 116 amino acids) and one of the other aforementioned constant region genes, e.g., gamma (encoding about 330 amino acids).

[2828] The term “antigen-binding fragment” of an antibody (or simply “antibody portion,” or “fragment”), as used herein, refers to one or more fragments of a full-length antibody that retain the ability to specifically bind to the antigen, e.g., 47508 polypeptide or fragment thereof. Examples of antigen-binding fragments of the anti-47508 antibody include, but are not limited to: (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab′)₂ fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 341:544-546), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) Science 242:423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879-5883). Such single chain antibodies are also encompassed within the term “antigen-binding fragment” of an antibody. These antibody fragments are obtained using conventional techniques known to those with skill in the art, and the fragments are screened for utility in the same manner as are intact antibodies.

[2829] The anti-47508 antibody can be a polyclonal or a monoclonal antibody. In other embodiments, the antibody can be recombinantly produced, e.g., produced by phage display or by combinatorial methods.

[2830] Phage display and combinatorial methods for generating anti-47508 antibodies are known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. International Publication No. WO 92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. International Publication No. WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and Barbas et al. (1991) PNAS 88:7978-7982, the contents of all of which are incorporated by reference herein).

[2831] In one embodiment, the anti-47508 antibody is a fully human antibody (e.g., an antibody made in a mouse which has been genetically engineered to produce an antibody from a human immunoglobulin sequence), or a non-human antibody, e.g., a rodent (mouse or rat), goat, primate (e.g., monkey), or camel antibody. Preferably, the non-human antibody is a rodent (mouse or rat) antibody. Methods of producing rodent antibodies are known in the art.

[2832] Human monoclonal antibodies can be generated using transgenic mice carrying the human immunoglobulin genes rather than the mouse system. Splenocytes from these transgenic mice immunized with the antigen of interest are used to produce hybridomas that secrete human mAbs with specific affinities for epitopes from a human protein (see, e.g., Wood et al. International Application WO 91/00906; Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International Application WO 92/03918; Kay et al. International Application WO 92/03917; Lonberg, N. et al. 1994 Nature 368:856-859; Green, L. L. et al. 1994 Nature Genet. 7:13-21; Morrison, S. L. et al. 1984 Proc. Natl. Acad. Sci. USA 81:6851-6855; Bruggeman et al. 1993 Year Immunol 7:33-40; Tuaillon et al. 1993 PNAS 90:3720-3724; Bruggeman et al. 1991 Eur J Immunol 21:1323-1326).

[2833] An anti-47508 antibody can be one in which the variable region, or a portion thereof, e.g., the CDR's, are generated in a non-human organism, e.g., a rat or mouse. Chimeric, CDR-grafted, and humanized antibodies are within the invention. Antibodies generated in a non-human organism, e.g., a rat or mouse, and then modified, e.g., in the variable framework or constant region, to decrease antigenicity in a human are within the invention.

[2834] Chimeric antibodies can be produced by recombinant DNA techniques known in the art. For example, a gene encoding the Fc constant region of a murine (or other species) monoclonal antibody molecule is digested with restriction enzymes to remove the region encoding the murine Fc, and the equivalent portion of a gene encoding a human Fc constant region is substituted (see Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., International Application WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al., European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al. (1987) PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al. (1985) Nature 314:446-449; and Shaw et al., 1988, J. Natl Cancer Inst. 80:1553-1559).

[2835] A humanized or CDR-grafted antibody will have at least one or two but generally all three recipient CDR's (of heavy and/or light immunoglobulin chains) replaced with a donor CDR. The antibody may be replaced with at least a portion of a non-human CDR or only some of the CDR's may be replaced with non-human CDR's. It is only necessary to replace the number of CDR's required for binding of the humanized antibody to a 47508 or a fragment thereof. Preferably, the donor will be a rodent antibody, e.g., a rat or mouse antibody, and the recipient will be a human framework or a human consensus framework. Typically, the immunoglobulin providing the CDR's is called the “donor” and the immunoglobulin providing the framework is called the “acceptor.” In one embodiment, the donor immunoglobulin is a non-human (e.g., rodent). The acceptor framework is a naturally-occurring (e.g., a human) framework or a consensus framework, or a sequence about 85% or higher, preferably 90%, 95%, 99% or higher identical thereto.

[2836] As used herein, the term “consensus sequence” refers to the sequence formed from the most frequently occurring amino acids (or nucleotides) in a family of related sequences (see, e.g., Winnacker, From Genes to Clones (Verlagsgesellschaft, Weinheim, Germany 1987)). In a family of proteins, each position in the consensus sequence is occupied by the amino acid occurring most frequently at that position in the family. If two amino acids occur equally frequently, either can be included in the consensus sequence. A “consensus framework” refers to the framework region in the consensus immunoglobulin sequence.
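
The definition of a consensus sequence given above can be expressed as a short sketch. It assumes the family members have already been aligned to equal length; the function name and the example sequences in the usage note are hypothetical. Because either residue may be used when two occur equally frequently, the sketch resolves ties alphabetically, which is an arbitrary choice made only so that the output is reproducible.

```python
# Illustrative sketch only: names and tie-breaking rule are assumptions.
from collections import Counter


def consensus(aligned_sequences):
    """Return the most frequent residue at each position of pre-aligned sequences."""
    if not aligned_sequences:
        raise ValueError("at least one sequence is required")
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be pre-aligned to equal length")
    result = []
    for column in zip(*aligned_sequences):
        counts = Counter(column)
        top = max(counts.values())
        # Several residues may share the top count; any is acceptable, so take
        # the alphabetically first one for a deterministic result.
        tied = sorted(res for res, n in counts.items() if n == top)
        result.append(tied[0])
    return "".join(result)
```

For the hypothetical aligned family `["QVQLV", "EVQLV", "QVKLV", "EVQLL"]`, the first position is tied between E and Q, so either `"EVQLV"` or `"QVQLV"` would satisfy the definition; this sketch returns the former.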

[2837] An antibody can be humanized by methods known in the art. Humanized antibodies can be generated by replacing sequences of the Fv variable region which are not directly involved in antigen binding with equivalent sequences from human Fv variable regions. General methods for generating humanized antibodies are provided by Morrison, S. L., 1985, Science 229:1202-1207, by Oi et al., 1986, BioTechniques 4:214, and by Queen et al. U.S. Pat. No. 5,585,089, U.S. Pat. No. 5,693,761 and U.S. Pat. No. 5,693,762, the contents of all of which are hereby incorporated by reference. Those methods include isolating, manipulating, and expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable regions from at least one of a heavy or light chain. Sources of such nucleic acid are well known to those skilled in the art and, for example, may be obtained from a hybridoma producing an antibody against a 47508 polypeptide or fragment thereof. The recombinant DNA encoding the humanized antibody, or fragment thereof, can then be cloned into an appropriate expression vector.

[2838] Humanized or CDR-grafted antibodies can be produced by CDR-grafting or CDR substitution, wherein one, two, or all CDR's of an immunoglobulin chain can be replaced. See e.g., U.S. Pat. No. 5,225,539; Jones et al. 1986 Nature 321:522-525; Verhoeyen et al. 1988 Science 239:1534; Beidler et al. 1988 J. Immunol. 141:4053-4060; Winter U.S. Pat. No. 5,225,539, the contents of all of which are hereby expressly incorporated by reference. Winter describes a CDR-grafting method which may be used to prepare the humanized antibodies of the present invention (UK Patent Application GB 2188638A, filed on Mar. 26, 1987; Winter U.S. Pat. No. 5,225,539), the contents of which are expressly incorporated by reference.

[2839] Also within the scope of the invention are humanized antibodies in which specific amino acids have been substituted, deleted or added. Preferred humanized antibodies have amino acid substitutions in the framework region, such as to improve binding to the antigen. For example, a humanized antibody will have framework residues identical to the donor framework residue or to another amino acid other than the recipient framework residue. To generate such antibodies, a selected, small number of acceptor framework residues of the humanized immunoglobulin chain can be replaced by the corresponding donor amino acids. Preferred locations of the substitutions include amino acid residues adjacent to the CDR, or which are capable of interacting with a CDR (see e.g., U.S. Pat. No. 5,585,089). Criteria for selecting amino acids from the donor are described in U.S. Pat. No. 5,585,089, e.g., columns 12-16 of U.S. Pat. No. 5,585,089, the contents of which are hereby incorporated by reference. Other techniques for humanizing antibodies are described in Padlan et al. EP 519596 A1, published on Dec. 23, 1992.

[2840] In preferred embodiments an antibody can be made by immunizing with purified 47508 antigen, or a fragment thereof, e.g., a fragment described herein, membrane associated antigen, tissue, e.g., crude tissue preparations, whole cells, preferably living cells, lysed cells, or cell fractions, e.g., nuclear cytosol.

[2841] A full-length 47508 protein or an antigenic peptide fragment of 47508 can be used as an immunogen or can be used to identify anti-47508 antibodies made with other immunogens, e.g., cells, membrane preparations, and the like. The antigenic peptide of 47508 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO:42 and encompass an epitope of 47508. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.

[2842] Fragments of 47508 can be used as immunogens or used to characterize the specificity of an antibody. For example, fragments of 47508 which include residues about 19 to 35, about 245 to 263, or about 286 to 310 can be used to make, e.g., antibodies against hydrophilic regions of the 47508 protein. Similarly, fragments of 47508 which include residues about 151 to 172, about 215 to 232, or about 377 to 393 can be used to make an antibody against a hydrophobic region of the 47508 protein; fragments of 47508 which include residues about 1 to 57, or about 50 to 82 can be used to make an antibody against an N-terminal portion of the 47508 protein; and fragments of 47508 which include residues about 90 to 110, about 200 to 220, about 240 to 260, about 320 to 340, or about 360 to 375 can be used to make an antibody against the histone deacetylase region of the 47508 protein.

[2843] Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[2844] Antibodies which bind only native 47508 protein, only denatured or otherwise non-native 47508 protein, or which bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be identified by identifying antibodies which bind to native but not denatured 47508 protein.

[2845] Preferred epitopes encompassed by the antigenic peptide are regions of 47508 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 47508 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 47508 protein and are thus likely to constitute surface residues useful for targeting antibody production.
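
As a rough illustration of how hydrophilic, and therefore possibly surface-exposed, stretches might be located, the following sketch scans a sequence with a sliding window. It deliberately uses the Kyte-Doolittle hydropathy scale in place of the Emini surface probability analysis named above, so it is a simpler stand-in rather than the method of the specification. The window size, the threshold, the function names, and the toy sequence are all assumptions chosen for illustration.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic,
# negative = hydrophilic). Used here as a stand-in for an Emini analysis.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(sequence, window=7):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[res] for res in sequence.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]

def hydrophilic_windows(sequence, window=7, threshold=-1.5):
    """Return (start, end, mean) for windows at or below the threshold.

    Positions are 1-based and inclusive, matching the residue numbering
    convention used for the fragments described in the text.
    """
    hits = []
    for i, mean in enumerate(hydropathy_profile(sequence, window)):
        if mean <= threshold:
            hits.append((i + 1, i + window, round(mean, 2)))
    return hits

# Hypothetical toy sequence; the real input would be SEQ ID NO:42.
toy = "IIIIIIIKKKKKKK"
print(hydrophilic_windows(toy, window=7, threshold=-3.0))
```

Windows reported by such a scan would only be candidates; antigenicity and surface exposure would still need to be confirmed experimentally.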

[2846] In preferred embodiments antibodies can bind one or more of purified antigen, membrane associated antigen, tissue, e.g., tissue sections, whole cells, preferably living cells, lysed cells, cell fractions, e.g., nuclear cytoplasm.

[2847] The anti-47508 antibody can be a single chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher, D. et al. (1999) Ann N Y Acad Sci 880:263-80; and Reiter, Y. (1996) Clin Cancer Res 2:245-52). The single chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 47508 protein.

[2848] In a preferred embodiment the antibody has effector function and can fix complement. In other embodiments the antibody does not recruit effector cells or fix complement.

[2849] In a preferred embodiment, the antibody has reduced or no ability to bind an Fc receptor. For example, it is an isotype or subtype, fragment or other mutant, which does not support binding to an Fc receptor, e.g., it has a mutagenized or deleted Fc receptor binding region.

[2850] In a preferred embodiment, an anti-47508 antibody alters (e.g., increases or decreases) the histone deacetylase activity of a 47508 polypeptide. For example, the antibody can bind at or in proximity to the active site, e.g., to an epitope that includes a residue located from about 90 to 110, about 200 to 220, about 240 to 260, about 320 to 340, or about 360 to 375 of SEQ ID NO:42.

[2851] The antibody can be coupled to a toxin, e.g., a polypeptide toxin, e.g., ricin or diphtheria toxin or an active fragment thereof, or to a radioactive nucleus or an imaging agent, e.g., a radioactive, enzymatic, or other imaging agent, e.g., an NMR contrast agent. Labels which produce detectable radioactive emissions or fluorescence are preferred.

[2852] An anti-47508 antibody (e.g., monoclonal antibody) can be used to isolate 47508 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-47508 antibody can be used to detect 47508 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-47508 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labelling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

[2853] The invention also includes a nucleic acid which encodes an anti-47508 antibody, e.g., an anti-47508 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g., CHO or lymphatic cells.

[2854] The invention also includes cell lines, e.g., hybridomas, which make an anti-47508 antibody, e.g., an antibody described herein, and methods of using said cells to make a 47508 antibody.

[2855] 47508 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[2856] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[2857] A vector can include a 47508 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 47508 proteins, mutant forms of 47508 proteins, fusion proteins, and the like).

[2858] The recombinant expression vectors of the invention can be designed for expression of 47508 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[2859] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[2860] Purified fusion proteins can be used in 47508 activity assays (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 47508 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells which are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six weeks).

[2861] One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
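
The codon-alteration strategy can be sketched as a back-translation of the protein sequence using one preferred codon per amino acid. The preferred-codon table below is an illustrative assumption and is not taken from the specification or from Wada et al.; a current E. coli codon usage table should be consulted before designing a real construct. The function name and the example peptide are likewise hypothetical.

```python
# Illustrative one-codon-per-residue table for E. coli (an assumption for
# this sketch; verify against a real codon usage table before use).
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TAA",  # stop codon
}

def back_translate(protein):
    """Encode a protein sequence using the preferred codon for each residue."""
    try:
        return "".join(PREFERRED_CODON[res] for res in protein.upper())
    except KeyError as err:
        raise ValueError(f"unsupported residue: {err.args[0]}") from None

# Hypothetical short peptide followed by a stop.
print(back_translate("MKLV*"))
```

Choosing a single codon per residue keeps the sketch short; practical designs often also account for repeats, secondary structure, and restriction sites, none of which are modeled here.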

[2862] The 47508 expression vector can be a yeast expression vector, a vector for expression in insect cells, e.g., a baculovirus expression vector or a vector suitable for expression in mammalian cells.

[2863] When used in mammalian cells, the expression vector's control functions can be provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[2864] In another embodiment, the promoter is an inducible promoter, e.g., a promoter regulated by a steroid hormone, by a polypeptide hormone (e.g., by means of a signal transduction pathway), or by a heterologous polypeptide (e.g., the tetracycline-inducible systems, “Tet-On” and “Tet-Off”; see, e.g., Clontech Inc., CA, Gossen and Bujard (1992) Proc. Natl. Acad. Sci. USA 89:5547, and Paillard (1989) Human Gene Therapy 9:983).

[2865] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Campes and Tilghman (1989) Genes Dev. 3:537-546).

[2866] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus.

[2867] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., a 47508 nucleic acid molecule within a recombinant expression vector or a 47508 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[2868] A host cell can be any prokaryotic or eukaryotic cell. For example, a 47508 protein can be expressed in bacterial cells (such as E. coli), insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells (African green monkey kidney cells CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182)). Other suitable host cells are known to those skilled in the art.

[2869] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[2870] A host cell of the invention can be used to produce (i.e., express) a 47508 protein. Accordingly, the invention further provides methods for producing a 47508 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a 47508 protein has been introduced) in a suitable medium such that a 47508 protein is produced. In another embodiment, the method further includes isolating a 47508 protein from the medium or the host cell.

[2871] In another aspect, the invention features a cell or purified preparation of cells which include a 47508 transgene, or which otherwise misexpress 47508. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include a 47508 transgene, e.g., a heterologous form of a 47508, e.g., a gene derived from humans (in the case of a non-human cell). The 47508 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene that misexpresses an endogenous 47508, e.g., a gene the expression of which is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders that are related to mutated or misexpressed 47508 alleles or for use in drug screening.

[2872] In another aspect, the invention features a human cell, e.g., a hematopoietic or hepatic stem cell, transformed with a nucleic acid which encodes a subject 47508 polypeptide.

[2873] Also provided are cells, preferably human cells, e.g., human hematopoietic, hepatic, or fibroblast cells, in which an endogenous 47508 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 47508 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 47508 gene. For example, an endogenous 47508 gene which is “transcriptionally silent,” e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques such as targeted homologous recombination can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published on May 16, 1991.

[2874] In a preferred embodiment, recombinant cells described herein can be used for replacement therapy in a subject. For example, a nucleic acid encoding a 47508 polypeptide operably linked to an inducible promoter (e.g., a steroid hormone receptor-regulated promoter) is introduced into a human or nonhuman, e.g., mammalian, e.g., porcine recombinant cell. The cell is cultivated and encapsulated in a biocompatible material, such as poly-lysine alginate, and subsequently implanted into the subject. See, e.g., Lanza (1996) Nat. Biotechnol. 14:1107; Joki et al. (2001) Nat. Biotechnol. 19:35; and U.S. Pat. No. 5,876,742. Production of 47508 polypeptide can be regulated in the subject by administering an agent (e.g., a steroid hormone) to the subject. In another preferred embodiment, the implanted recombinant cells express and secrete an antibody specific for a 47508 polypeptide. The antibody can be any antibody or any antibody derivative described herein.

[2875] 47508 Transgenic Animals

[2876] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of a 47508 protein and for identifying and/or evaluating modulators of 47508 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 47508 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[2877] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of a 47508 protein to particular cells. A transgenic founder animal can be identified based upon the presence of a 47508 transgene in its genome and/or expression of 47508 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a 47508 protein can further be bred to other transgenic animals carrying other transgenes.

[2878] 47508 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments the nucleic acid is placed under the control of a tissue specific promoter, e.g., a milk or egg specific promoter, and the protein or polypeptide is recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[2879] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[2880] Uses of 47508

[2881] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: a) screening assays; b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); and c) methods of treatment (e.g., therapeutic and prophylactic).

[2882] The isolated nucleic acid molecules of the invention can be used, for example, to express a 47508 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect a 47508 mRNA (e.g., in a biological sample) or a genetic alteration in a 47508 gene, and to modulate 47508 activity, as described further below. The 47508 proteins can be used to treat disorders characterized by insufficient or excessive production of a 47508 substrate or production of 47508 inhibitors. In addition, the 47508 proteins can be used to screen for naturally occurring 47508 substrates, to screen for drugs or compounds which modulate 47508 activity, as well as to treat disorders characterized by insufficient or excessive production of 47508 protein or production of 47508 protein forms which have decreased, aberrant or unwanted activity compared to 47508 wild type protein (e.g., cellular proliferative and/or differentiative disorder). Moreover, the anti-47508 antibodies of the invention can be used to detect and isolate 47508 proteins, regulate the bioavailability of 47508 proteins, and modulate 47508 activity.

[2883] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 47508 polypeptide is provided. The method includes: contacting the compound with the subject 47508 polypeptide; and evaluating ability of the compound to interact with, e.g., to bind or form a complex with the subject 47508 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with subject 47508 polypeptide. It can also be used to find natural or synthetic inhibitors of subject 47508 polypeptide. Screening methods are discussed in more detail below.

[2884] 47508 Screening Assays

[2885] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to 47508 proteins, have a stimulatory or inhibitory effect on, for example, 47508 expression or 47508 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 47508 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 47508 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[2886] In one embodiment, the invention provides assays for screening candidate or test compounds which are substrates of a 47508 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate an activity of a 47508 protein or polypeptide or a biologically active portion thereof.

[2887] In one embodiment, an activity of a 47508 protein can be assayed by expressing a 47508 nucleic acid in a cell, e.g., a mammalian cell, such that 47508 protein is produced, purifying the 47508 protein, e.g., by means of an affinity tag, e.g., a HIS6 tag, mixing the purified 47508 protein with histone proteins, e.g., histones H2A, H2B, H3, and H4, which are acetylated with ³H-labeled acetyl groups, and monitoring cleavage of the labeled acetyl groups from the histone proteins over time. An example of this method is shown in Hu et al. (2000), J Biol Chem 275(20):15254-64, the contents of which are incorporated herein by reference.
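
The time-course readout described above lends itself to a simple numerical summary. The following sketch fits a least-squares line to released-label counts versus time and compares the slope with and without a test compound. The function names, the units (counts per minute released, minutes), and the data points are hypothetical and serve only to illustrate the calculation; they do not come from Hu et al. or from the specification.

```python
def release_rate(times_min, released_cpm):
    """Least-squares slope of released label (cpm) versus time (min)."""
    if len(times_min) != len(released_cpm) or len(times_min) < 2:
        raise ValueError("need two or more paired measurements")
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(released_cpm) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y)
              for t, y in zip(times_min, released_cpm))
    return sxy / sxx

def percent_inhibition(rate_control, rate_with_compound):
    """Reduction in release rate relative to the untreated control."""
    if rate_control == 0:
        raise ValueError("control rate must be non-zero")
    return 100.0 * (1.0 - rate_with_compound / rate_control)

# Hypothetical measurements, for illustration only.
times = [0, 10, 20, 30]
control = [0, 100, 200, 300]
treated = [0, 50, 100, 150]
print(percent_inhibition(release_rate(times, control),
                         release_rate(times, treated)))
```

A negative result from `percent_inhibition` would indicate that the compound increased, rather than decreased, the apparent deacetylase activity.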

[2888] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann, R. N. et al. (1994) J. Med. Chem. 37:2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).

[2889] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. U.S.A. 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and Gallop et al. (1994) J. Med. Chem. 37:1233.

[2890] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner, U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. 5,223,409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382; Felici (1991) J. Mol. Biol. 222:301-310; Ladner supra.).

[2891] In one embodiment, an assay is a cell-based assay in which a cell which expresses a 47508 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 47508 activity is determined. Determining the ability of the test compound to modulate 47508 activity can be accomplished by monitoring, for example, histone deacetylation, e.g., deacetylation of histone H2A, H2B, H3, or H4. The cell, for example, can be of mammalian origin, e.g., human.

[2892] The ability of the test compound to modulate 47508 binding to a compound, e.g., a 47508 substrate, or to bind to 47508 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 47508 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 47508 could be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 47508 binding to a 47508 substrate in a complex. For example, compounds (e.g., 47508 substrates) can be labeled with ¹²⁵I, ³⁵S, ¹⁴C, or ³H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[2893] The ability of a compound (e.g., a 47508 substrate) to interact with 47508 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 47508 without the labeling of either the compound or the 47508. McConnell, H. M. et al. (1992) Science 257:1906-1912. As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 47508.

[2894] In yet another embodiment, a cell-free assay is provided in which a 47508 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 47508 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 47508 proteins to be used in assays of the present invention include fragments which participate in interactions with non-47508 molecules, e.g., fragments with high surface probability scores.

[2895] Soluble and/or membrane-bound forms of isolated proteins (e.g., 47508 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, Isotridecylpoly(ethylene glycol ether)ₙ, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[2896] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[2897] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternatively, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
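
By way of illustration only, the distance dependence of energy transfer is conventionally described by the Förster relation, E = 1/(1 + (r/R0)^6), where r is the donor-acceptor separation and R0 is the separation at which transfer is 50% efficient. The following minimal sketch shows how steeply the efficiency falls with distance. The function name and the R0 value of 5 nm are assumptions chosen for illustration and are not taken from this disclosure.

```python
def fret_efficiency(r_nm: float, r0_nm: float = 5.0) -> float:
    """Förster transfer efficiency for a donor-acceptor separation r_nm.

    r0_nm is the Förster distance (separation giving 50% transfer).
    The default of 5.0 nm is an assumed, illustrative value.
    """
    if r_nm < 0 or r0_nm <= 0:
        raise ValueError("r_nm must be >= 0 and r0_nm must be > 0")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


if __name__ == "__main__":
    # Tabulate efficiency at a few illustrative separations.
    for r in (2.5, 5.0, 7.5, 10.0):
        print(f"r = {r:4.1f} nm -> E = {fret_efficiency(r):.3f}")
```

Because the efficiency depends on the sixth power of the separation, a bound donor-acceptor pair and an unbound pair typically give clearly distinguishable acceptor emission.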

[2898] In another embodiment, determining the ability of the 47508 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal which can be used as an indication of real-time reactions between biological molecules.

[2899] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound, (which is not anchored), can be labeled, either directly or indirectly, with detectable labels discussed herein.

[2900] It may be desirable to immobilize either 47508, an anti-47508 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a 47508 protein, or interaction of a 47508 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/47508 fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 47508 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and the complex is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 47508 binding or activity determined using standard techniques.

[2901] Other techniques for immobilizing either a 47508 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 47508 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical).

[2902] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[2903] In one embodiment, this assay is performed utilizing antibodies reactive with 47508 protein or target molecules but which do not interfere with binding of the 47508 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 47508 protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 47508 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 47508 protein or target molecule.

[2904] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components, by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., (1993) Trends Biochem Sci 18:284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York.); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. (1999) Current Protocols in Molecular Biology, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., (1998) J Mol Recognit 11: 141-8; Hage, D. S., and Tweed, S. A. (1997) J Chromatogr B Biomed Sci Appl. 699:499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[2905] In a preferred embodiment, the assay includes contacting the 47508 protein or biologically active portion thereof with a known compound which binds 47508 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a 47508 protein, wherein determining the ability of the test compound to interact with a 47508 protein includes determining the ability of the test compound to preferentially bind to 47508 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[2906] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 47508 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of a 47508 protein through modulation of the activity of a downstream effector of a 47508 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[2907] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), a reaction mixture containing the target gene product and the binding partner is prepared under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.
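
As an illustrative sketch only, the comparison between the control reaction and the reaction containing the test compound can be expressed as a percent reduction in complex signal. The function name, the background-subtraction step, and the numerical signal values below are hypothetical and are not taken from this disclosure.

```python
def percent_inhibition(control_signal: float, test_signal: float,
                       background: float = 0.0) -> float:
    """Percent reduction in complex signal relative to the no-compound control.

    control_signal: complex signal without the test compound (or with placebo).
    test_signal:    complex signal in the presence of the test compound.
    background:     assumed signal from a blank with no complex (hypothetical step).
    """
    control = control_signal - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (1.0 - (test_signal - background) / control)


if __name__ == "__main__":
    # Hypothetical readings in arbitrary units.
    print(f"{percent_inhibition(1200.0, 300.0, background=100.0):.1f}% inhibition")
```

The same calculation can be run separately on reaction mixtures containing normal and mutant target gene product, so that a compound showing high inhibition for the mutant but low inhibition for the normal product can be flagged.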

[2908] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[2909] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner, is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled, either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[2910] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[2911] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[2912] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in that either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[2913] In yet another aspect, the 47508 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 47508 (“47508-binding proteins” or “47508-bp”) and are involved in 47508 activity. Such 47508-bps can be activators or inhibitors of signals by the 47508 proteins or 47508 targets as, for example, downstream elements of a 47508-mediated signaling pathway.

[2914] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a 47508 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively, the 47508 protein can be fused to the activator domain.) If the “bait” and the “prey” proteins are able to interact, in vivo, forming a 47508-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., lacZ) which is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene which encodes the protein which interacts with the 47508 protein.

[2915] In another embodiment, modulators of 47508 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 47508 mRNA or protein evaluated relative to the level of expression of 47508 mRNA or protein in the absence of the candidate compound. When expression of 47508 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 47508 mRNA or protein expression. Alternatively, when expression of 47508 mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 47508 mRNA or protein expression. The level of 47508 mRNA or protein expression can be determined by methods described herein for detecting 47508 mRNA or protein.
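
By way of a non-limiting sketch, the comparison of expression in the presence and absence of a candidate compound, including a test of statistical significance, might be carried out as follows. The exact permutation test, the 0.05 threshold, the function names, and the replicate values are all assumptions made for illustration. The disclosure does not prescribe a particular statistical method.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Two-sided exact permutation p-value for a difference in group means."""
    pooled = list(treated) + list(untreated)
    n_treated = len(treated)
    n_rest = len(pooled) - n_treated
    observed = abs(mean(treated) - mean(untreated))
    total = sum(pooled)
    extreme = 0
    n_splits = 0
    # Enumerate every way of relabelling n_treated of the pooled values.
    for idx in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_treated - (total - group_sum) / n_rest)
        if diff >= observed - 1e-12:  # tolerance for floating-point noise
            extreme += 1
        n_splits += 1
    return extreme / n_splits


def classify_candidate(treated, untreated, alpha=0.05):
    """Label a compound from expression levels with and without the compound."""
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"


if __name__ == "__main__":
    # Hypothetical normalized mRNA levels, four replicates per condition.
    with_compound = [2.1, 2.3, 2.0, 2.2]
    without_compound = [1.0, 1.1, 0.9, 1.0]
    print(classify_candidate(with_compound, without_compound))
```

With four replicates per condition the smallest attainable two-sided p-value is 2/70, so fewer replicates than this cannot reach the 0.05 threshold under this particular test.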

[2916] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a 47508 protein can be confirmed in vivo, e.g., in an animal such as an animal model for a cellular proliferative and/or differentiative disorder.

[2917] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a 47508 modulating agent, an antisense 47508 nucleic acid molecule, a 47508-specific antibody, or a 47508-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[2918] 47508 Detection Assays

[2919] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome, e.g., to locate gene regions associated with genetic disease or to associate 47508 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[2920] 47508 Chromosome Mapping

[2921] The 47508 nucleotide sequences or portions thereof can be used to map the location of the 47508 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 47508 sequences with genes associated with disease.

[2922] Briefly, 47508 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 47508 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 47508 sequences will yield an amplified fragment.

[2923] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes, and a full set of mouse chromosomes, can allow easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220:919-924).

[2924] Other mapping strategies, e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries, can be used to map 47508 to a chromosomal location.

[2925] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases, will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques ((1988) Pergamon Press, New York).

[2926] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[2927] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325:783-787.

[2928] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 47508 gene, can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

[2929] 47508 Tissue Typing

[2930] 47508 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).
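
The RFLP principle described above can be sketched in silico as follows. This is an illustration only: the two short allele sequences are invented placeholders and not portions of any sequence disclosed herein, and the function name is hypothetical. The default recognition site is that of EcoRI (GAATTC, cut after the first G), and a linear DNA molecule is assumed.

```python
def digest_fragment_lengths(dna: str, site: str = "GAATTC", cut_offset: int = 1):
    """Return fragment lengths, in order, from digesting a linear DNA molecule.

    site and cut_offset default to EcoRI, which cuts G^AATTC.
    """
    dna = dna.upper()
    cuts = []
    pos = dna.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = dna.find(site, pos + 1)
    bounds = [0] + cuts + [len(dna)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Placeholder alleles: allele_b carries a point change that destroys the second site.
    allele_a = "ATGC" * 5 + "GAATTC" + "ATGC" * 10 + "GAATTC" + "ATGC" * 3
    allele_b = "ATGC" * 5 + "GAATTC" + "ATGC" * 10 + "GAGTTC" + "ATGC" * 3
    print("allele A fragments:", digest_fragment_lengths(allele_a))
    print("allele B fragments:", digest_fragment_lengths(allele_b))
```

A single base difference that creates or destroys a recognition site changes the set of fragment lengths, which is what appears as a different band pattern on the blot.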

[2931] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 47508 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.

[2932] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences of SEQ ID NO:41 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO:43 are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[2933] If a panel of reagents from 47508 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[2934] Use of Partial 47508 Sequences in Forensic Biology

[2935] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[2936] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e. another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO:41 (e.g., fragments derived from the noncoding regions of SEQ ID NO:41 having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[2937] The 47508 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 47508 probes can be used to identify tissue by species and/or by organ type.

[2938] In a similar fashion, these reagents, e.g., 47508 primers or probes, can be used to screen tissue culture for contamination (i.e. screen for the presence of a mixture of different types of cells in a culture).

[2939] Predictive Medicine of 47508

[2940] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[2941] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene which encodes 47508.

[2942] Such disorders include, e.g., a disorder associated with the misexpression of 47508 gene, e.g., a cellular proliferative and/or differentiative disorder.

[2943] The method includes one or more of the following:

[2944] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 47508 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[2945] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 47508 gene;

[2946] detecting, in a tissue of the subject, the misexpression of the 47508 gene, at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[2947] detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of a 47508 polypeptide.

[2948] In preferred embodiments the method includes: ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 47508 gene; an insertion of one or more nucleotides into the gene, a point mutation, e.g., a substitution of one or more nucleotides of the gene, a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[2949] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from SEQ ID NO:41, or naturally occurring mutants thereof or 5′ or 3′ flanking sequences naturally associated with the 47508 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[2950] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 47508 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 47508.

[2951] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[2952] In preferred embodiments the method includes determining the structure of a 47508 gene, an abnormal structure being indicative of risk for the disorder.

[2953] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 47508 protein or a nucleic acid, which hybridizes specifically with the gene. These and other embodiments are discussed below.

[2954] Diagnostic and Prognostic Assays of 47508

[2955] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 47508 molecules and for identifying variations and mutations in the sequence of 47508 molecules.

[2956] Expression Monitoring and Profiling:

[2957] The presence, level, or absence of 47508 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 47508 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 47508 protein such that the presence of 47508 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the 47508 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 47508 genes; measuring the amount of protein encoded by the 47508 genes; or measuring the activity of the protein encoded by the 47508 genes.

[2958] The level of mRNA corresponding to the 47508 gene in a cell can be determined both by in situ and by in vitro formats.

[2959] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 47508 nucleic acid, such as the nucleic acid of SEQ ID NO:41, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 47508 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[2960] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 47508 genes.

[2961] The level of mRNA in a sample that is encoded by 47508 can be evaluated with nucleic acid amplification, e.g., by rtPCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
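The size ranges stated above for amplification primers can be illustrated with a short check. The following Python sketch is illustrative only and is not part of the specification: the function names `reverse_complement` and `primer_pair_ok` are hypothetical, and the sketch assumes that the forward primer matches the plus strand and that the reverse primer is the reverse complement of a downstream plus-strand site.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def primer_pair_ok(template, fwd, rev, primer_len=(10, 30), flank_len=(50, 200)):
    """Check a primer pair against the approximate ranges given in the text.

    Both primers must be about 10 to 30 nucleotides long, and the region
    between their binding sites must be about 50 to 200 nucleotides long.
    """
    template, fwd, rev = template.upper(), fwd.upper(), rev.upper()
    # Length check on each primer.
    if not all(primer_len[0] <= len(p) <= primer_len[1] for p in (fwd, rev)):
        return False
    # Locate the forward primer site on the plus strand.
    start = template.find(fwd)
    if start == -1:
        return False
    # The reverse primer anneals to the plus strand at its reverse complement.
    end = template.find(reverse_complement(rev), start + len(fwd))
    if end == -1:
        return False
    # Length of the region lying between the two primer sites.
    flanked = end - (start + len(fwd))
    return flank_len[0] <= flanked <= flank_len[1]
```

Exact string matching stands in for annealing here; a practical primer design tool would also consider melting temperature and secondary structure, which this sketch does not attempt.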

[2962] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 47508 gene being analyzed.

[2963] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 47508 mRNA, or genomic DNA, and comparing the presence of 47508 mRNA or genomic DNA in the control sample with the presence of 47508 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 47508 transcript levels.

[2964] A variety of methods can be used to determine the level of protein encoded by 47508. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody, with a sample, to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)₂) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[2965] The detection methods can be used to detect 47508 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 47508 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 47508 protein include introducing into a subject a labeled anti-47508 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-47508 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[2966] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 47508 protein, and comparing the presence of 47508 protein in the control sample with the presence of 47508 protein in the test sample.

[2967] The invention also includes kits for detecting the presence of 47508 in a biological sample. For example, the kit can include a compound or agent capable of detecting 47508 protein or mRNA in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 47508 protein or nucleic acid.

[2968] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[2969] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples which can be assayed and compared to the test sample. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[2970] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with misexpressed or aberrant or unwanted 47508 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as a cellular proliferative and/or differentiative disorder.

[2971] In one embodiment, a disease or disorder associated with aberrant or unwanted 47508 expression or activity is identified. A test sample is obtained from a subject and 47508 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 47508 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 47508 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[2972] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 47508 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for, e.g., a cellular proliferative and/or differentiative disorder.

[2973] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 47508 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 47508 (e.g., other genes associated with a 47508-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
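The data record described above can be pictured as a row in a relational table. The following Python sketch is illustrative only: it uses the SQLite engine from the Python standard library in place of the Oracle or Sybase environments named in the text, and the table name `expression_record`, the column names, the gene label `GENE_X`, and all values are hypothetical placeholders rather than data from the specification.

```python
import sqlite3

# In-memory database standing in for the relational database of the text.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE expression_record (
           sample_id TEXT,   -- identifier of the sample (descriptor)
           subject   TEXT,   -- subject from which the sample was derived
           diagnosis TEXT,   -- diagnosis associated with the sample
           gene      TEXT,   -- gene whose expression is recorded
           level     REAL,   -- value representing the level of expression
           PRIMARY KEY (sample_id, gene))"""
)

# Placeholder records: one value for 47508 and one for another gene per sample.
rows = [
    ("S1", "patient-001", "unaffected", "47508", 1.0),
    ("S1", "patient-001", "unaffected", "GENE_X", 3.2),
    ("S2", "patient-002", "disorder", "47508", 4.5),
    ("S2", "patient-002", "disorder", "GENE_X", 2.9),
]
conn.executemany("INSERT INTO expression_record VALUES (?, ?, ?, ?, ?)", rows)


def levels_for(gene):
    """Return (sample_id, diagnosis, level) for every record of one gene."""
    cur = conn.execute(
        "SELECT sample_id, diagnosis, level FROM expression_record "
        "WHERE gene = ? ORDER BY sample_id",
        (gene,),
    )
    return cur.fetchall()
```

Storing one row per sample and gene is only one possible layout; a table with one column per gene would satisfy the same description.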

[2974] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 47508 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used to diagnose a cellular proliferative and/or differentiative disorder in a subject wherein an increase in 47508 expression is an indication that the subject has or is disposed to having a cellular proliferative and/or differentiative disorder. The method can be used to monitor a treatment for a cellular proliferative and/or differentiative disorder in a subject. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[2975] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 47508 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for a desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from an uncontacted cell.

[2976] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; and b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 47508 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profiles is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
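The distance metric and optional steps c) and d) above can be sketched in a few lines. The following Python example is illustrative only: it reads "the length of the distance vector" as the Euclidean norm of the difference between two profiles, which is one possible interpretation, and the function names `profile_distance` and `most_similar` are hypothetical.

```python
import math


def profile_distance(p, q):
    """Length of the difference vector between two expression profiles.

    Each profile is a sequence of expression values; position i in both
    profiles must refer to the same gene.
    """
    if len(p) != len(q):
        raise ValueError("profiles must have the same number of dimensions")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def most_similar(subject, references):
    """Return the name of the reference profile closest to the subject.

    `references` maps a descriptor (e.g., a diagnosis) to a profile.
    """
    return min(references, key=lambda name: profile_distance(subject, references[name]))
```

For example, with hypothetical reference profiles `{"normal": [1.0, 2.0, 2.5], "disorder": [4.0, 0.5, 6.0]}` and a subject profile `[1.0, 2.0, 3.0]`, `most_similar` selects the `"normal"` profile because its difference vector is shorter.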

[2977] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[2978] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile, and the reference expression profiles each include a value representing the level of 47508 expression.

[2979] 47508 Arrays and Uses Thereof

[2980] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to a 47508 molecule (e.g., a 47508 nucleic acid or a 47508 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm², and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to the addresses of the plurality can be disposed on the array.

[2981] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to a 47508 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 47508. Each address of the subset can include a capture probe that hybridizes to a different region of a 47508 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for a 47508 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 47508 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 47508 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).

[2982] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[2983] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to a 47508 polypeptide or fragment thereof. The polypeptide can be a naturally-occurring interaction partner of 47508 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-47508 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[2984] In another aspect, the invention features a method of analyzing the expression of 47508. The method includes providing an array as described above; contacting the array with a sample; and detecting binding of a 47508-molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally, the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[2985] In another embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array, particularly the expression of 47508. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 47508. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
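The idea of grouping genes by expression pattern can be illustrated with a deliberately simple stand-in. The following Python sketch does not implement any of the clustering methods named above; it only ranks genes by Pearson correlation with 47508 across samples, a similarity measure of the kind often used as input to such clustering. The function names, the threshold of 0.9, and the gene labels in the usage note are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no co-regulation signal
    return cov / (sd_x * sd_y)


def coexpressed_with(target, expression, threshold=0.9):
    """Return genes whose expression across samples tracks the target gene.

    `expression` maps a gene name to its expression values, one value per
    sample, with samples in the same order for every gene.
    """
    ref = expression[target]
    return sorted(
        gene
        for gene, values in expression.items()
        if gene != target and pearson(ref, values) >= threshold
    )
```

Given hypothetical data in which `GENE_A` rises with 47508, `GENE_B` falls as 47508 rises, and `GENE_C` is constant, only `GENE_A` is returned as a candidate co-regulated gene.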

[2986] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 47508 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[2987] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of other than the target gene can be ascertained and counteracted.

[2988] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of a 47508-associated disease or disorder; and processes, such as a cellular transformation associated with a 47508-associated disease or disorder. The method can also evaluate the treatment and/or progression of a 47508-associated disease or disorder.

[2989] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 47508) that could serve as a molecular target for diagnosis or therapeutic intervention.

[2990] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon a 47508 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to a 47508 polypeptide or fragment thereof. For example, multiple variants of a 47508 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the addresses of the plurality can be disposed on the array.

[2991] The polypeptide array can be used to detect a 47508 binding compound, e.g., an antibody in a sample from a subject with specificity for a 47508 polypeptide or the presence of a 47508-binding protein or ligand.

[2992] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 47508 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[2993] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 47508 or from a cell or subject in which a 47508 mediated response has been elicited, e.g., by contact of the cell with 47508 nucleic acid or protein, or administration to the cell or subject of 47508 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 47508 (or does not express as highly as in the case of the 47508 positive plurality of capture probes) or from a cell or subject in which a 47508 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which are preferably other than a 47508 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[2994] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, contacting the array with a first sample from a cell or subject which expresses or mis-expresses 47508 or from a cell or subject in which a 47508-mediated response has been elicited, e.g., by contact of the cell with 47508 nucleic acid or protein, or administration to the cell or subject of 47508 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a second sample from a cell or subject which does not express 47508 (or does not express as highly as in the case of the 47508 positive plurality of capture probes) or from a cell or subject in which a 47508 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used, the plurality of addresses with capture probes should be present on both arrays.

[2995] In another aspect, the invention features a method of analyzing 47508, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 47508 nucleic acid or amino acid sequence; and comparing the 47508 sequence with one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 47508.

[2996] Detection of 47508 Variations or Mutations

[2997] The methods of the invention can also be used to detect genetic alterations in a 47508 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by misregulation in 47508 protein activity or nucleic acid expression, such as a cellular proliferative and/or differentiative disorder. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a 47508-protein, or the mis-expression of the 47508 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of 1) a deletion of one or more nucleotides from a 47508 gene; 2) an addition of one or more nucleotides to a 47508 gene; 3) a substitution of one or more nucleotides of a 47508 gene, 4) a chromosomal rearrangement of a 47508 gene; 5) an alteration in the level of a messenger RNA transcript of a 47508 gene, 6) aberrant modification of a 47508 gene, such as of the methylation pattern of the genomic DNA, 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of a 47508 gene, 8) a non-wild type level of a 47508-protein, 9) allelic loss of a 47508 gene, and 10) inappropriate post-translational modification of a 47508-protein.

[2998] An alteration can be detected with a probe/primer in a polymerase chain reaction, such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 47508-gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a 47508 gene under conditions such that hybridization and amplification of the 47508-gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[2999] In another embodiment, mutations in a 47508 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA are isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment lengths are determined, e.g., by gel electrophoresis, and compared. Differences in fragment lengths between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
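The comparison of restriction fragment lengths can be sketched computationally. The following Python example is illustrative only: the function names `digest` and `fragments_differ` are hypothetical, the DNA is treated as a linear single-strand string, and the sequences in the usage note are made-up placeholders rather than 47508 sequence. The recognition site used in the usage note, GAATTC cut after the first G, is that of EcoRI.

```python
def digest(seq, site, cut_offset):
    """Return fragment lengths after cutting linear DNA at every site.

    `site` is the recognition sequence and `cut_offset` is the number of
    bases from the start of the site to the cut position.
    """
    seq, site = seq.upper(), site.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    # Fragment boundaries: start of the molecule, each cut, end of the molecule.
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def fragments_differ(sample, control, site, cut_offset):
    """True if sample and control yield different sets of fragment lengths."""
    return sorted(digest(sample, site, cut_offset)) != sorted(
        digest(control, site, cut_offset)
    )
```

For a hypothetical 40-nucleotide control containing two GAATTC sites, `digest` returns three fragments; a sample in which one site is changed to GAGTTC yields two fragments, and `fragments_differ` reports the change. A mutation that does not create or destroy a site, or alter fragment length, would not be detected this way.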

[3000] In other embodiments, genetic mutations in 47508 can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the other. A different probe is located at each address of the plurality. A probe can be complementary to a region of a 47508 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of a 47508 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 47508 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
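The two probe layouts described above, a linear array of sequential overlapping probes followed by paired wild-type and mutant probes, can be sketched as follows. This Python example is illustrative only: the function names, the probe lengths, and the step size are hypothetical, and each probe is represented by the target-strand sequence it interrogates (a physical probe would be the complement of that sequence).

```python
def tiling_probes(reference, probe_len=20, step=1):
    """Return sequential overlapping windows covering the reference sequence."""
    last_start = len(reference) - probe_len
    return [reference[i:i + probe_len] for i in range(0, last_start + 1, step)]


def wildtype_mutant_pair(reference, position, mutant_base, probe_len=21):
    """Return (wild-type, mutant) probes centered on a 0-based position.

    The two probes are identical except at the central base, mirroring the
    parallel probe sets described for the mutation array.
    """
    half = probe_len // 2
    start = position - half
    if start < 0 or start + probe_len > len(reference):
        raise ValueError("position too close to the end of the reference")
    wild_type = reference[start:start + probe_len]
    mutant = wild_type[:half] + mutant_base + wild_type[half + 1:]
    return wild_type, mutant
```

With a step of one nucleotide, consecutive tiling probes overlap by all but one base, so any single base change falls within several probes; larger steps trade coverage for fewer addresses.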

[3001] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 47508 gene and detect mutations by comparing the sequence of the sample 47508 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.

[3002] Other methods for detecting mutations in the 47508 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl. Acad Sci USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295).

[3003] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 47508 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymidine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[3004] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 47508 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc Natl. Acad. Sci USA 86:2766; see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control 47508 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence, and the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).

[3005] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[3006] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad. Sci USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[3007] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.
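
As a non-limiting sketch, a pair of allele-specific forward primers whose extreme 3′ base falls on the variant position might be derived as shown below. The function name, the default primer length, and the example sequence are hypothetical and serve only to illustrate the 3′-terminal design described above.

```python
def allele_specific_primers(reference: str, variant_index: int,
                            alt_base: str, length: int = 20) -> tuple[str, str]:
    """Return (wild_type_primer, mutant_primer), both written 5'->3'.

    Both primers share the same upstream bases and differ only at their
    3'-terminal base, which sits on the 0-based variant_index.
    """
    reference = reference.upper()
    start = variant_index - length + 1
    if start < 0 or variant_index >= len(reference):
        raise ValueError("variant position does not leave room for the primer")
    upstream = reference[start:variant_index]
    wild_type = upstream + reference[variant_index]
    mutant = upstream + alt_base.upper()
    return wild_type, mutant


if __name__ == "__main__":
    # Made-up template; not a 47508 sequence.
    template = "ACGTACGTACGTACGTACGTACGT"
    wt, mut = allele_specific_primers(template, variant_index=21,
                                      alt_base="T", length=10)
    print(wt)
    print(mut)
```

Under appropriate conditions, only the primer whose 3′ base matches the template would be expected to extend efficiently, so the two reactions can be compared to infer the allele present.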

[3008] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 50%, 60%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to a 47508 nucleic acid.
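
For illustration, the percent complementarity of an oligonucleotide to a target site could be computed along the following lines. This is a minimal sketch that assumes an ungapped comparison of equal-length DNA sequences, both written 5′ to 3′; the function names and example sequences are assumptions and do not correspond to any sequence of the invention.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_complementarity(oligo: str, target_site: str) -> float:
    """Percentage of oligo positions that base-pair with the target site.

    Both sequences are given 5'->3' and must have the same non-zero length.
    A fully complementary oligo equals the reverse complement of the site.
    """
    if not oligo or len(oligo) != len(target_site):
        raise ValueError("sequences must be non-empty and of equal length")
    expected = reverse_complement(target_site)
    matches = sum(a == b for a, b in zip(oligo.upper(), expected))
    return 100.0 * matches / len(oligo)


if __name__ == "__main__":
    site = "AAGGCCTTAC"  # hypothetical target site
    print(percent_complementarity("GTAAGGCCTT", site))
    print(percent_complementarity("GTAAGGCCTA", site))
```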

[3009] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO:41 or the complement of SEQ ID NO:41. Different locations can be different but overlapping, or non-overlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[3010] The set can be useful, e.g., for identifying SNPs, or identifying specific alleles of 47508. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a biallelic or polymorphic locus.

[3011] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the T_m of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
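
A set of four oligonucleotides differing only at the interrogation position can be sketched, purely as an example, as follows. The function name and the template sequence are hypothetical; the template is not taken from SEQ ID NO:41.

```python
def interrogation_set(template: str, position: int) -> dict[str, str]:
    """Return four oligonucleotides keyed by the base at the interrogation position.

    The oligonucleotides are identical to the template except at the
    0-based interrogation position, where each carries a different base.
    """
    template = template.upper()
    if not 0 <= position < len(template):
        raise IndexError("interrogation position lies outside the template")
    return {
        base: template[:position] + base + template[position + 1:]
        for base in "ACGT"
    }


if __name__ == "__main__":
    # Made-up 9-mer with the interrogation position at its center.
    for base, oligo in interrogation_set("ACGTACGTA", 4).items():
        print(base, oligo)
```

Each member of such a set could then be given a distinguishable label or a separate array address, as described above.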

[3012] In a preferred embodiment the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, a 47508 nucleic acid.

[3013] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a 47508 gene.

[3014] Use of 47508 Molecules as Surrogate Markers

[3015] The 47508 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 47508 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 47508 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker which correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[3016] The 47508 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker which correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., a 47508 marker) transcription or expression, the amplified marker may be in a quantity which is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-47508 antibodies may be employed in an immune-based detection system for a 47508 protein marker, or 47508-specific radiolabeled probes may be used to detect a 47508 mRNA marker. Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. 
Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[3017] The 47508 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker which correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35:1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA, or protein (e.g., 47508 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 47508 DNA may correlate with 47508 drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[3018] Pharmaceutical Compositions of 47508

[3019] The nucleic acid and polypeptides, fragments thereof, as well as anti-47508 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[3020] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[3021] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol, sorbitol, sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

[3022] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying which yields a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[3023] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[3024] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[3025] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[3026] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[3027] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[3028] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[3029] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds which exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
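
The therapeutic index described above is a simple ratio, which may be illustrated as follows. The function name and the numerical values are hypothetical and are given solely to show the arithmetic; they are not measured data for any compound of the invention.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index LD50/ED50.

    Both values must be positive and expressed in the same units
    (e.g., mg/kg), so that the ratio is dimensionless.
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


if __name__ == "__main__":
    # Hypothetical values in mg/kg for two illustrative compounds.
    candidates = {"compound_a": (500.0, 25.0), "compound_b": (300.0, 100.0)}
    for name, (ld50, ed50) in candidates.items():
        print(name, therapeutic_index(ld50, ed50))
```

In this made-up comparison, the compound with the larger ratio would be the one preferred under the criterion stated above.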

[3030] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.

[3031] As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
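
Solely to illustrate the arithmetic of the weight-based ranges recited above, the following sketch converts a range in mg/kg into a total amount in mg for a given body weight. The dictionary keys, the function name, and the 70 kg example weight are assumptions introduced for this sketch, and only the first of the narrowest listed ranges (about 1 to 10 mg/kg) is represented.

```python
# Ranges in mg per kg body weight, transcribed from the paragraph above.
PROTEIN_DOSE_RANGES_MG_PER_KG = {
    "broad": (0.001, 30.0),
    "preferred": (0.01, 25.0),
    "more_preferred": (0.1, 20.0),
    "even_more_preferred": (1.0, 10.0),
}


def dose_range_mg(body_weight_kg: float, tier: str = "broad") -> tuple[float, float]:
    """Return the (low, high) total amount in mg for a given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = PROTEIN_DOSE_RANGES_MG_PER_KG[tier]
    return low * body_weight_kg, high * body_weight_kg


if __name__ == "__main__":
    # Hypothetical 70 kg subject.
    print(dose_range_mg(70.0, "even_more_preferred"))
```

As noted above, the amount actually required in a given case depends on factors such as the severity of the disease or disorder and the general health of the subject, which this arithmetic does not capture.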

[3032] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J. Acquired Immune Deficiency Syndromes and Human Retrovirology 14:193).

[3033] The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[3034] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

[3035] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, puromycin, maytansinoids, e.g., maytansinol (see U.S. Pat. No. 5,208,020), CC-1065 (see U.S. Pat. Nos. 5,475,092, 5,585,499, 5,846,545) and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, CC-1065, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine, vinblastine, taxol and maytansinoids). Radioactive ions include, but are not limited to iodine, yttrium and praseodymium.

[3036] The conjugates of the invention can be used for modifying a given biological response; the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[3037] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[3038] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[3039] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[3040] Methods of Treatment for 47508

[3041] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 47508 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[3042] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. “Pharmacogenomics”, as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype”, or “drug response genotype”). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 47508 molecules of the present invention or 47508 modulators according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

[3043] In one aspect, the invention provides a method for preventing in a subject, a disease or condition associated with an aberrant or unwanted 47508 expression or activity, by administering to the subject a 47508 or an agent which modulates 47508 expression or at least one 47508 activity. Subjects at risk for a disease which is caused or contributed to by aberrant or unwanted 47508 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 47508 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 47508 aberrance, for example, a 47508, 47508 agonist or 47508 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[3044] It is possible that some 47508 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[3045] The 47508 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of cellular proliferative and/or differentiative disorders, disorders of the breast, ovary, lung, colon, and liver, as well as cardiovascular disorders, as has been described above. In addition, the 47508 molecules of the invention can act as novel diagnostic targets and therapeutic agents for controlling disorders associated with bone metabolism, immune disorders, viral diseases, and pain or metabolic disorders.

[3046] Aberrant expression and/or activity of 47508 molecules may mediate disorders associated with bone metabolism. “Bone metabolism” refers to direct or indirect effects in the formation or degeneration of bone structures, e.g., bone formation, bone resorption, etc., which may ultimately affect the concentrations in serum of calcium and phosphate. This term also includes activities mediated by the effects of 47508 molecules in bone cells, e.g., osteoclasts and osteoblasts, that may in turn result in bone formation and degeneration. For example, 47508 molecules may support different activities of bone resorbing osteoclasts such as the stimulation of differentiation of monocytes and mononuclear phagocytes into osteoclasts. Accordingly, 47508 molecules that modulate the production of bone cells can influence bone formation and degeneration, and thus may be used to treat bone disorders. Examples of such disorders include, but are not limited to, osteoporosis, osteodystrophy, osteomalacia, rickets, osteitis fibrosa cystica, renal osteodystrophy, osteosclerosis, anti-convulsant treatment, osteopenia, fibrogenesis-imperfecta ossium, secondary hyperparathyroidism, hypoparathyroidism, hyperparathyroidism, cirrhosis, obstructive jaundice, drug induced metabolism, medullary carcinoma, chronic renal disease, rickets, sarcoidosis, glucocorticoid antagonism, malabsorption syndrome, steatorrhea, tropical sprue, idiopathic hypercalcemia and milk fever.

[3047] The 47508 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of immune disorders. Examples of immune disorders or diseases include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy, such as atopic allergy.

[3048] Additionally, 47508 molecules may play an important role in the etiology of certain viral diseases, including but not limited to Hepatitis B, Hepatitis C and Herpes Simplex Virus (HSV). Modulators of 47508 activity could be used to control viral diseases. The modulators can be used in the treatment and/or diagnosis of viral infected tissue or virus-associated tissue fibrosis, especially liver and liver fibrosis. Also, 47508 modulators can be used in the treatment and/or diagnosis of virus-associated carcinoma, especially hepatocellular cancer.

[3049] Additionally, 47508 may play an important role in the regulation of metabolism or pain disorders. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes. Examples of pain disorders include, but are not limited to, pain response elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia, usually referred to as hyperalgesia (described in, for example, Fields, H. L. (1987) Pain, New York: McGraw-Hill); pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches; pain associated with surgery; pain related to irritable bowel syndrome; or chest pain.

[3050] Additional disorders which may be treated or diagnosed by methods described herein include, but are not limited to, disorders associated with an accumulation in the liver of fibrous tissue, such as that resulting from an imbalance between production and degradation of the extracellular matrix accompanied by the collapse and condensation of preexisting fibers. The methods described herein can be used to diagnose or treat hepatocellular necrosis or injury induced by a wide variety of agents including processes which disturb homeostasis, such as an inflammatory process, tissue damage resulting from toxic injury or altered hepatic blood flow, and infections (e.g., bacterial, viral and parasitic). For example, the methods can be used for the early detection of hepatic injury, such as portal hypertension or hepatic fibrosis. In addition, the methods can be employed to detect liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, A1-antitrypsin deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease), disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome). 
Additionally, the methods described herein may be useful for the early detection and treatment of liver injury associated with the administration of various chemicals or drugs, such as, for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol, or which represents a hepatic manifestation of a vascular disorder such as obstruction of either the intrahepatic or extrahepatic bile flow or an alteration in hepatic circulation resulting, for example, from chronic heart failure, veno-occlusive disease, portal vein thrombosis or Budd-Chiari syndrome.

[3051] As used herein, disorders involving the heart, or “cardiovascular disease” or a “cardiovascular disorder” includes a disease or disorder which affects the cardiovascular system, e.g., the heart, the blood vessels, and/or the blood. A cardiovascular disorder can be caused by an imbalance in arterial pressure, a malfunction of the heart, or an occlusion of a blood vessel, e.g., by a thrombus. A cardiovascular disorder includes, but is not limited to, disorders such as arteriosclerosis, atherosclerosis, cardiac hypertrophy, ischemia reperfusion injury, restenosis, arterial inflammation, vascular wall remodeling, ventricular remodeling, rapid ventricular pacing, coronary microembolism, tachycardia, bradycardia, pressure overload, aortic bending, coronary artery ligation, vascular heart disease, valvular disease, including but not limited to, valvular degeneration caused by calcification, rheumatic heart disease, endocarditis, or complications of artificial valves; atrial fibrillation, long-QT syndrome, congestive heart failure, sinus node dysfunction, angina, heart failure, hypertension, atrial flutter, pericardial disease, including but not limited to, pericardial effusion and pericarditis; cardiomyopathies, e.g., dilated cardiomyopathy or idiopathic cardiomyopathy, myocardial infarction, coronary artery disease, coronary artery spasm, ischemic disease, arrhythmia, sudden cardiac death, and cardiovascular developmental disorders (e.g., arteriovenous malformations, arteriovenous fistulae, Raynaud's syndrome, neurogenic thoracic outlet syndrome, causalgia/reflex sympathetic dystrophy, hemangioma, aneurysm, cavernous angioma, aortic valve stenosis, atrial septal defects, atrioventricular canal, coarctation of the aorta, Ebstein's anomaly, hypoplastic left heart syndrome, interruption of the aortic arch, mitral valve prolapse, ductus arteriosus, patent foramen ovale, partial anomalous pulmonary venous return, pulmonary atresia with ventricular septal defect, pulmonary atresia without ventricular septal defect, persistence of the fetal circulation, pulmonary valve stenosis, single ventricle, total anomalous pulmonary venous return, transposition of the great vessels, tricuspid atresia, truncus arteriosus, ventricular septal defects). A cardiovascular disease or disorder also can include an endothelial cell disorder.

[3052] As used herein, an “endothelial cell disorder” includes a disorder characterized by aberrant, unregulated, or unwanted endothelial cell activity, e.g., proliferation, migration, angiogenesis, or vascularization; or aberrant expression of cell surface adhesion molecules or genes associated with angiogenesis, e.g., TIE-2, FLT and FLK. Endothelial cell disorders include tumorigenesis, tumor metastasis, psoriasis, diabetic retinopathy, endometriosis, Graves' disease, ischemic disease (e.g., atherosclerosis), and chronic inflammatory diseases (e.g., rheumatoid arthritis).

[3053] As discussed, successful treatment of 47508 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, a compound, e.g., an agent identified using an assay described above, that proves to exhibit negative modulatory activity can be used in accordance with the invention to prevent and/or ameliorate symptoms of 47508 disorders. Such molecules can include, but are not limited to, peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)₂ and Fab expression library fragments, scFv molecules, and epitope-binding fragments thereof).

[3054] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[3055] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[3056] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 47508 expression is through the use of aptamer molecules specific for 47508 protein. Aptamers are nucleic acid molecules having a tertiary structure which permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. (1997) Curr. Opin. Chem Biol. 1: 5-9; and Patel, D. J. (1997) Curr Opin Chem Biol 1:32-46). Since nucleic acid molecules may in many cases be more conveniently introduced into target cells than therapeutic protein molecules may be, aptamers offer a method by which 47508 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[3057] Antibodies can be generated that are both specific for target gene product and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 47508 disorders. For a description of antibodies, see the Antibody section above.

[3058] In circumstances wherein injection of an animal or a human subject with a 47508 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 47508 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. (1999) Ann Med 31:66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. (1998) Cancer Treat Res. 94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 47508 protein. Vaccines directed to a disease characterized by 47508 expression may also be generated in this fashion.

[3059] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see e.g., Marasco et al. (1993) Proc. Natl. Acad. Sci. USA 90:7889-7893).

[3060] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 47508 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures as described above.

[3061] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED₅₀ with little or no toxicity. The dosage can vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose can be formulated in animal models to achieve a circulating plasma concentration range that includes the IC₅₀ (i.e., the concentration of the test compound that achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma can be measured, for example, by high performance liquid chromatography.
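As a purely illustrative aid to the IC₅₀ concept described above, the following Python sketch estimates the concentration giving half-maximal inhibition by interpolating between measured points on a logarithmic concentration scale. The function name, the interpolation scheme, and the example values are assumptions made for illustration; they are not data or methods taken from this disclosure.

```python
import math


def estimate_ic50(concentrations, inhibition_percent):
    """Estimate the concentration giving 50% inhibition.

    Hypothetical helper: interpolates linearly on log10(concentration)
    between the two measured points that bracket 50% inhibition.
    """
    pairs = sorted(zip(concentrations, inhibition_percent))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(pairs, pairs[1:]):
        if y_lo <= 50.0 <= y_hi and y_hi != y_lo:
            fraction = (50.0 - y_lo) / (y_hi - y_lo)
            log_lo, log_hi = math.log10(c_lo), math.log10(c_hi)
            return 10 ** (log_lo + fraction * (log_hi - log_lo))
    raise ValueError("50% inhibition is not bracketed by the data")


# Made-up cell culture readout: concentration (arbitrary units) vs. % inhibition.
example_ic50 = estimate_ic50([1.0, 100.0], [0.0, 100.0])
```

In practice a fitted dose-response model would usually replace the simple interpolation shown here.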

[3062] Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. The compound which is able to modulate 47508 activity is used as a template, or “imprinting molecule”, to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix which contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al (1996) Current Opinion in Biotechnology 7:89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2:166-173. Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al (1993) Nature 361:645-647. Through the use of isotope-labeling, the “free” concentration of compound which modulates the expression or activity of 47508 can be readily monitored and used in calculations of IC₅₀.

[3063] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC₅₀. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al (1995) Analytical Chemistry 67:2142-2144.

[3064] Another aspect of the invention pertains to methods of modulating 47508 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with a 47508 molecule or an agent that modulates one or more of the activities of 47508 protein associated with the cell. An agent that modulates 47508 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of a 47508 protein (e.g., a 47508 substrate or receptor), a 47508 antibody, a 47508 agonist or antagonist, a peptidomimetic of a 47508 agonist or antagonist, or other small molecule.

[3065] In one embodiment, the agent stimulates one or more 47508 activities. Examples of such stimulatory agents include active 47508 protein and a nucleic acid molecule encoding 47508. In another embodiment, the agent inhibits one or more 47508 activities. Examples of such inhibitory agents include antisense 47508 nucleic acid molecules, anti-47508 antibodies, and 47508 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of a 47508 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., up regulates or down regulates) 47508 expression or activity. In another embodiment, the method involves administering a 47508 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 47508 expression or activity.

[3066] Stimulation of 47508 activity is desirable in situations in which 47508 is abnormally downregulated and/or in which increased 47508 activity is likely to have a beneficial effect. Likewise, inhibition of 47508 activity is desirable in situations in which 47508 is abnormally upregulated and/or in which decreased 47508 activity is likely to have a beneficial effect.

[3067] 47508 Pharmacogenomics

[3068] The 47508 molecules of the present invention, as well as agents, or modulators which have a stimulatory or inhibitory effect on 47508 activity (e.g., 47508 gene expression) as identified by a screening assay described herein can be administered to individuals to treat (prophylactically or therapeutically) 47508 associated disorders (e.g., cellular proliferative and/or differentiative disorders) associated with aberrant or unwanted 47508 activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a 47508 molecule or 47508 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a 47508 molecule or 47508 modulator.

[3069] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23:983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43:254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as single factors altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase deficiency (G6PD) is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[3070] One pharmacogenomics approach to identifying genes that predict drug response, known as a “genome-wide association”, relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high resolution map can be generated from a combination of some ten-million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
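The grouping of individuals into genetic categories by SNP pattern, as described above, can be sketched in Python as follows. This is a minimal illustration only: the function name, the marker identifiers (`snp_1`, `snp_2`), the subject labels, and the genotype calls are all hypothetical and are not drawn from any data in this disclosure.

```python
from collections import defaultdict


def group_by_snp_pattern(genotypes, marker_ids):
    """Group subjects that share the same alleles at the chosen markers.

    genotypes: mapping of subject label -> {marker id: genotype call}
    marker_ids: ordered list of the markers that define a pattern
    """
    groups = defaultdict(list)
    for subject, calls in genotypes.items():
        pattern = tuple(calls[marker] for marker in marker_ids)
        groups[pattern].append(subject)
    return dict(groups)


# Hypothetical genotype calls at two bi-allelic markers.
genotypes = {
    "subject_A": {"snp_1": "A/A", "snp_2": "C/T"},
    "subject_B": {"snp_1": "A/G", "snp_2": "C/T"},
    "subject_C": {"snp_1": "A/A", "snp_2": "C/T"},
}
categories = group_by_snp_pattern(genotypes, ["snp_1", "snp_2"])
```

Each resulting category could then be examined for a shared drug response, which is the comparison the paragraph above contemplates.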

[3071] Alternatively, a method termed the “candidate gene approach” can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., a 47508 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.

[3072] Alternatively, a method termed “gene expression profiling” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., a 47508 molecule or 47508 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[3073] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 47508 molecule or 47508 modulator, such as a modulator identified by one of the exemplary screening assays described herein.

[3074] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 47508 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 47508 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells, e.g., human cells, will become sensitive to treatment with an agent that the unmodified target cells were resistant to.

[3075] Monitoring the influence of agents (e.g., drugs) on the expression or activity of a 47508 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 47508 gene expression, protein levels, or upregulate 47508 activity, can be monitored in clinical trials of subjects exhibiting decreased 47508 gene expression, protein levels, or downregulated 47508 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 47508 gene expression, protein levels, or downregulate 47508 activity, can be monitored in clinical trials of subjects exhibiting increased 47508 gene expression, protein levels, or upregulated 47508 activity. In such clinical trials, the expression or activity of a 47508 gene, and preferably, other genes that have been implicated in, for example, a 47508-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[3076] 47508 Informatics

[3077] The sequence of a 47508 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a 47508 sequence. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 47508 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequence, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[3078] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[3079] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[3080] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples for annotation to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites and splice junctions. Non-limiting examples for annotations to amino acid sequence include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
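By way of a non-limiting sketch, the two-table layout described above can be expressed with Python's standard `sqlite3` module. SQLite is used here only as a convenient stand-in for the relational databases named in the text; the table names, column names, the identifier `demo_seq`, and the short placeholder sequence are assumptions for illustration and do not represent an actual 47508 sequence.

```python
import sqlite3


def build_database():
    """Create an in-memory database with a sequence table and an annotation table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sequences (
            sequence TEXT NOT NULL,      -- first column: the sequence itself
            seq_id   TEXT PRIMARY KEY    -- second column: its identifier
        );
        CREATE TABLE annotations (
            seq_id     TEXT NOT NULL REFERENCES sequences(seq_id),
            descriptor TEXT NOT NULL,    -- annotation text
            start_pos  INTEGER NOT NULL, -- initial position (1-based)
            end_pos    INTEGER NOT NULL  -- ultimate position (1-based, inclusive)
        );
        """
    )
    return conn


def annotated_regions(conn, seq_id):
    """Return (descriptor, subsequence) pairs for every annotation of a sequence."""
    rows = conn.execute(
        "SELECT a.descriptor, "
        "       substr(s.sequence, a.start_pos, a.end_pos - a.start_pos + 1) "
        "FROM annotations a JOIN sequences s ON s.seq_id = a.seq_id "
        "WHERE a.seq_id = ? ORDER BY a.start_pos",
        (seq_id,),
    )
    return rows.fetchall()


conn = build_database()
# Placeholder sequence and annotations, invented for this example.
conn.execute("INSERT INTO sequences VALUES (?, ?)", ("ATGGCCAAGTGA", "demo_seq"))
conn.executemany(
    "INSERT INTO annotations VALUES (?, ?, ?, ?)",
    [("demo_seq", "start codon", 1, 3), ("demo_seq", "stop codon", 10, 12)],
)
regions = annotated_regions(conn, "demo_seq")
```

Keeping annotations in a separate table keyed by the sequence identifier allows any number of annotations per sequence without altering the sequence table.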

[3081] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.
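A search of stored sequences for a target sequence or motif, as described above, might be sketched as follows. This is a simple illustration using Python's `re` module rather than BLAST; the function name, the stored identifiers, and the placeholder sequences are hypothetical.

```python
import re


def find_matches(stored_sequences, target_pattern):
    """Locate every occurrence (including overlapping ones) of a target.

    stored_sequences: mapping of identifier -> sequence string
    target_pattern: a literal sequence or a regular-expression motif
    Returns (identifier, 1-based start position, matched text) tuples.
    """
    # A lookahead with a capture group reports overlapping matches as well.
    pattern = re.compile("(?=(%s))" % target_pattern)
    hits = []
    for seq_id, sequence in stored_sequences.items():
        for match in pattern.finditer(sequence):
            hits.append((seq_id, match.start() + 1, match.group(1)))
    return hits


# Placeholder sequences invented for the example.
stored = {"seq_1": "AAGAGAG", "seq_2": "CCCC"}
hits = find_matches(stored, "GAG")
```

A heuristic alignment tool would be preferable for large collections or inexact matches; the exact-match scan above is meant only to make the idea of a fragment search concrete.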

[3082] Thus, in one aspect, the invention features a method of analyzing 47508, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing a 47508 nucleic acid or amino acid sequence; and comparing the 47508 sequence with a second sequence, e.g., one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 47508. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[3083] The method can include evaluating the sequence identity between a 47508 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.
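One way to evaluate sequence identity between two already-aligned sequences is sketched below. The convention used (identical non-gap positions divided by alignment length) is one common choice and is an assumption here; the function name and the example strings are likewise hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.

    Gaps are written as '-' and never count as matches. The denominator
    is the full alignment length (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Placeholder alignment invented for the example.
identity = percent_identity("ACGT", "ACGA")
```

Producing the alignment itself is a separate step, typically handled by an alignment program, and is outside the scope of this sketch.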

[3084] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
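The observation that longer target sequences are less likely to occur at random can be made concrete with a small calculation. The sketch below assumes a simplified model in which every residue is independent and equally likely, which real sequence databases do not satisfy exactly; the function name and the example numbers are illustrative only.

```python
def expected_random_hits(target_length, database_length, alphabet_size):
    """Expected number of chance occurrences of one specific target.

    Assumes independent, uniformly distributed residues (a simplification):
    each of the (database_length - target_length + 1) start positions
    matches with probability (1 / alphabet_size) ** target_length.
    """
    positions = max(database_length - target_length + 1, 0)
    return positions * (1.0 / alphabet_size) ** target_length


# A 6-nucleotide target in a 4,101-nucleotide database (4,096 start positions).
six_mer_hits = expected_random_hits(6, 4101, 4)
# A 30-nucleotide target in the same database is far rarer by chance.
thirty_mer_hits = expected_random_hits(30, 4101, 4)
```

Under this model each additional nucleotide reduces the chance-match probability fourfold, and each additional amino acid reduces it twentyfold.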

[3085] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[3086] Thus, the invention features a method of making a computer readable record of a 47508 sequence which includes recording the sequence on a computer readable matrix. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.
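A computer readable record of the kind described above could, for example, be serialized as JSON. The sketch below records a sequence together with the coordinates of the first forward-strand open reading frame it finds. The field names, the simple ORF rule (first ATG followed by an in-frame stop codon), and the placeholder sequence are assumptions for illustration, not a description of how any 47508 ORF was identified.

```python
import json

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(sequence):
    """Return 1-based inclusive (start, end) of the first ATG...stop ORF, or None.

    Only the forward strand is scanned, and the stop codon is included
    in the reported range.
    """
    sequence = sequence.upper()
    start = sequence.find("ATG")
    while start != -1:
        for i in range(start + 3, len(sequence) - 2, 3):
            if sequence[i:i + 3] in STOP_CODONS:
                return (start + 1, i + 3)
        start = sequence.find("ATG", start + 1)
    return None


def make_record(seq_id, sequence):
    """Build a JSON text record holding the sequence and its ORF annotation."""
    record = {"id": seq_id, "sequence": sequence, "orf": first_orf(sequence)}
    return json.dumps(record, sort_keys=True)


# Placeholder sequence invented for the example.
record_text = make_record("demo_seq", "CCATGGCCTGAAA")
```

The resulting text can be written to any of the storage media discussed herein and parsed back with `json.loads`.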

[3087] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing a 47508 sequence, or record, in machine-readable form; and comparing a second sequence to the 47508 sequence; thereby analyzing a sequence. Comparison can include comparing to sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 47508 sequence includes a sequence being compared. In a preferred embodiment the 47508 or second sequence is stored on a first computer, e.g., at a first site, and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 47508 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.
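The two kinds of comparison named above, sequence identity and inclusion of one sequence within another, can be sketched as follows. This is a minimal illustration, not the search software referred to in this description: it assumes the two sequences have already been aligned by some other tool (gaps written as "-"), and it uses one of several possible definitions of percent identity (matches divided by ungapped aligned columns). The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns under this definition
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def contains(reference: str, query: str) -> bool:
    """True if the query occurs as an exact, contiguous subsequence of the reference."""
    return query.upper() in reference.upper()


if __name__ == "__main__":
    print(percent_identity("ACGT", "ACGA"))
    print(contains("ATGCCGTA", "ccg"))
```

Other definitions divide by the full alignment length or by the length of the shorter sequence, so the definition in use should be stated alongside any reported identity value.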

[3088] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has a 47508-associated disease or disorder or a pre-disposition to a 47508-associated disease or disorder, wherein the method comprises the steps of determining 47508 sequence information associated with the subject and based on the 47508 sequence information, determining whether the subject has a 47508-associated disease or disorder or a pre-disposition to a 47508-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[3089] The invention further provides in an electronic system and/or in a network, a method for determining whether a subject has a 47508-associated disease or disorder or a pre-disposition to a disease associated with a 47508 wherein the method comprises the steps of determining 47508 sequence information associated with the subject, and based on the 47508 sequence information, determining whether the subject has a 47508-associated disease or disorder or a pre-disposition to a 47508-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, comparing the 47508 sequence of the subject to the 47508 sequences in the database to thereby determine whether the subject has a 47508-associated disease or disorder, or a pre-disposition for such.

[3090] The present invention also provides in a network, a method for determining whether a subject has a 47508-associated disease or disorder or a pre-disposition to a 47508-associated disease or disorder, said method comprising the steps of receiving 47508 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 47508 and/or corresponding to a 47508-associated disease or disorder (e.g., cellular proliferative and/or differentiative disorders), and based on one or more of the phenotypic information, the 47508 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has a 47508-associated disease or disorder or a pre-disposition to a 47508-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[3091] The present invention also provides a method for determining whether a subject has a 47508-associated disease or disorder or a pre-disposition to a 47508-associated disease or disorder, said method comprising the steps of receiving information related to 47508 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 47508 and/or related to a 47508-associated disease or disorder, and based on one or more of the phenotypic information, the 47508 information, and the acquired information, determining whether the subject has a 47508-associated disease or disorder or a pre-disposition to a 47508-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[3092] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

Background of the 56939 Invention

[3093] Hydrolases are a large class of enzymes which catalyze the cleavage of a bond with the addition of water. Hydrolases play important roles in the synthesis and breakdown of nearly all major metabolic intermediates, including polypeptides, nucleic acids, and lipids. In particular, the α/β hydrolase family of enzymes is a phylogenetically diverse group of enzymes that have a common fold, typically comprising an eight-stranded β-sheet surrounded by α-helices (Ollis, D. et al. (1992) Protein Eng 5:197-211; Nardini and Dijkstra (1999) Curr Opin Str Bio 9:732-737). Members of the α/β hydrolase family are found in nearly all organisms, from microbes to plants to humans. Members of the hydrolase family of enzymes include enzymes that hydrolyze ester bonds (e.g., phosphatases, sulfatases, exonucleases, and endonucleases), glycosidases, enzymes that act on ether bonds, peptidases (e.g., exopeptidases and endopeptidases), as well as enzymes that hydrolyze carbon-nitrogen bonds, acid anhydrides, carbon-carbon bonds, halide bonds, phosphorous-nitrogen bonds, sulfur-nitrogen bonds, carbon-phosphorous bonds, and sulfur-sulfur bonds (E. C. Webb ed., Enzyme Nomenclature, pp. 306-450, © 1992 Academic Press, Inc. San Diego, Calif.).

[3094] α/β hydrolases vary widely in primary sequence, substrate specificity, and physical properties. However, despite the lack of sequence homology, hydrolase family members display structural similarities, e.g., conservation of a catalytic site framework. In particular, one conserved feature of the alpha/beta hydrolase fold is a nucleophile-histidine-acid catalytic triad. The identities of the triad residues in alpha/beta hydrolase fold enzymes are quite variable in that serine, aspartate, and cysteine have all been identified as catalytic nucleophiles (Schrag, J. et al. (1997) Meth. Enzymol. 284:85-107).

[3095] One particular class of α/β hydrolases is the acyl-CoA thioesterase enzymes, which catalyze the hydrolysis of acyl-CoAs of various carbon chain lengths to free fatty acids and CoA-SH. These enzymes are predicted to have a domain which adopts the α/β hydrolase fold. As a class, the acyl-CoA thioesterase enzymes have a conserved nucleophile-histidine-acid catalytic triad consisting of the amino acids serine, histidine, and aspartic acid. Some examples of acyl-CoA thioesterases are the murine enzymes CTE-I, MTE-I, and PTE-Ia/b, which are localized to the cytoplasm, mitochondria, and peroxisomes of cells, respectively (Hunt et al. (1999) J Biol Chem 274:34317-26). PTE-Ia/b transcription is controlled by the peroxisome proliferator activated receptor-α (PPARα), which is itself a key regulator of lipid metabolism (Id.). Given their catalytic activity and their transcriptional regulation, the acyl-CoA thioesterase enzymes are evidently closely linked to lipid metabolism.

Summary of the 56939 Invention

[3096] The present invention is based, in part, on the discovery of a novel acyl-CoA thioesterase family member, referred to herein as “56939”. The nucleotide sequence of a cDNA encoding 56939 is shown in SEQ ID NO:48, and the amino acid sequence of a 56939 polypeptide is shown in SEQ ID NO:49. In addition, the nucleotide sequence of the coding region is depicted in SEQ ID NO:50.

[3097] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes a 56939 protein or polypeptide, e.g., a biologically active portion of the 56939 protein. In a preferred embodiment the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO:49. In other embodiments, the invention provides isolated 56939 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO:48, SEQ ID NO:50, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO:48, SEQ ID NO:50, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:48, SEQ ID NO:50, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 56939 protein or an active fragment thereof.

[3098] In a related aspect, the invention further provides nucleic acid constructs that include a 56939 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included, are vectors and host cells containing the 56939 nucleic acid molecules of the invention e.g., vectors and host cells suitable for producing 56939 nucleic acid molecules and polypeptides.

[3099] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 56939-encoding nucleic acids.

[3100] In still another related aspect, isolated nucleic acid molecules that are antisense to a 56939 encoding nucleic acid molecule are provided.

[3101] In another aspect, the invention features 56939 polypeptides, and biologically active or antigenic fragments thereof, that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 56939-mediated or -related disorders. In another embodiment, the invention provides 56939 polypeptides having a 56939 activity. Preferred polypeptides are 56939 proteins including at least one acyl-CoA thioesterase domain, and, preferably, having a 56939 activity, e.g., a 56939 activity as described herein.

[3102] In other embodiments, the invention provides 56939 polypeptides, e.g., a 56939 polypeptide having the amino acid sequence shown in SEQ ID NO:49 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO:49 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:48, SEQ ID NO:50, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 56939 protein or an active fragment thereof.

[3103] In a related aspect, the invention further provides nucleic acid constructs that include a 56939 nucleic acid molecule described herein.

[3104] In a related aspect, the invention provides 56939 polypeptides or fragments operatively linked to non-56939 polypeptides to form fusion proteins.

[3105] In another aspect, the invention features antibodies and antigen-binding fragments thereof, that react with, or more preferably specifically bind 56939 polypeptides or fragments thereof. In one embodiment, the antibodies or antigen-binding fragment thereof competitively inhibit the binding of a second antibody to a 56939 polypeptide or a fragment thereof.

[3106] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 56939 polypeptides or nucleic acids.

[3107] In still another aspect, the invention provides a process for modulating 56939 polypeptide or nucleic acid expression or activity, e.g. using the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 56939 polypeptides or nucleic acids, such as conditions involving aberrant or deficient metabolism, detoxification, and cellular proliferation or differentiation.

[3108] The invention also provides assays for determining the activity of or the presence or absence of 56939 polypeptides or nucleic acid molecules in a biological sample, including for disease diagnosis.

[3109] In yet another aspect, the invention provides methods for inhibiting the proliferation, or inducing the killing, of a 56939-expressing cell, e.g., a hyper-proliferative 56939-expressing cell. The method includes contacting the cell with a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 56939 polypeptide or nucleic acid. In a preferred embodiment, the contacting step is effected in vitro or ex vivo. In other embodiments, the contacting step is effected in vivo, e.g., in a subject (e.g., a mammal, e.g., a human), as part of a therapeutic or prophylactic protocol. In a preferred embodiment, the cell is a hyperproliferative cell, e.g., a cell found in a solid tumor, a soft tissue tumor, or a metastatic lesion.

[3110] In a preferred embodiment, the compound is an inhibitor of a 56939 polypeptide. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion). In another preferred embodiment, the compound is an inhibitor of a 56939 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[3111] In a preferred embodiment, the compound is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[3112] In another aspect, the invention features methods for treating or preventing a disorder characterized by aberrant cellular proliferation or differentiation of a 56939-expressing cell, in a subject. Preferably, the method includes administering to the subject (e.g., a mammal, e.g., a human) an effective amount of a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 56939 polypeptide or nucleic acid. In a preferred embodiment, the disorder is a cancerous or pre-cancerous condition.

[3113] In a further aspect, the invention provides methods for evaluating the efficacy of a treatment of a disorder, e.g., proliferative disorder or a differentiative disorder. The method includes: treating a subject, e.g., a patient or an animal, with a protocol under evaluation (e.g., treating a subject with one or more of: chemotherapy, radiation, and/or a compound identified using the methods described herein); and evaluating the expression of a 56939 nucleic acid or polypeptide before and after treatment. A change, e.g., a decrease or increase, in the level of a 56939 nucleic acid (e.g., mRNA) or polypeptide after treatment, relative to the level of expression before treatment, is indicative of the efficacy of the treatment of the disorder. The level of 56939 nucleic acid or polypeptide expression can be detected by any method described herein.

[3114] In a preferred embodiment, the evaluating step includes obtaining a sample (e.g., a tissue sample, e.g., a biopsy, or a fluid sample) from the subject before and after treatment, and comparing the level of expression of a 56939 nucleic acid (e.g., mRNA) or polypeptide before and after treatment.

[3115] In another aspect, the invention provides methods for evaluating the efficacy of a therapeutic or prophylactic agent (e.g., an anti-neoplastic agent). The method includes: contacting a sample with an agent (e.g., a compound identified using the methods described herein, a cytotoxic agent) and, evaluating the expression of 56939 nucleic acid or polypeptide in the sample before and after the contacting step. A change, e.g., a decrease or increase, in the level of 56939 nucleic acid (e.g., mRNA) or polypeptide in the sample obtained after the contacting step, relative to the level of expression in the sample before the contacting step, is indicative of the efficacy of the agent. The level of 56939 nucleic acid or polypeptide expression can be detected by any method described herein. In a preferred embodiment, the sample includes cells obtained from a cancerous tissue or kidney, liver, heart, brain, colon, breast, ovary, prostate, lung, skin or skeletal muscle tissue.

[3116] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 56939 polypeptide or nucleic acid molecule, including for disease diagnosis.

[3117] In another aspect, the invention features a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes a 56939 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to a 56939 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 56939 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.

[3118] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

Detailed Description of 56939

[3119] The human 56939 sequence (Example 34; SEQ ID NO:48), which is approximately 1391 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 1266 nucleotides, including the termination codon (nucleotides indicated as coding of SEQ ID NO:48 in Example 34; SEQ ID NO:50). The coding sequence encodes a 421 amino acid protein (SEQ ID NO:49), the human 56939 protein of SEQ ID NO:49 and Example 35.
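The lengths stated above are mutually consistent, as the following arithmetic sketch shows: a 421-residue protein corresponds to 421 codons plus one termination codon. The function name is hypothetical, and the untranslated length is simply the difference between the approximate figures given in the text.

```python
CODON_LENGTH = 3  # nucleotides per codon


def coding_length(protein_residues: int, include_stop: bool = True) -> int:
    """Nucleotides needed to encode a protein, optionally counting the stop codon."""
    codons = protein_residues + (1 if include_stop else 0)
    return codons * CODON_LENGTH


if __name__ == "__main__":
    cds = coding_length(421)   # (421 + 1) * 3
    untranslated = 1391 - cds  # approximate total length minus coding sequence
    print(cds, untranslated)
```

On these figures, the coding sequence accounts for 1266 of the approximately 1391 nucleotides, leaving about 125 nucleotides of untranslated sequence.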

[3120] Human 56939 contains the following regions or other structural features:

[3121] an acyl-CoA thioesterase domain (ProDomain Accession PD006914) located at about amino acid residues 1 to 415 of SEQ ID NO:49;

[3122] three predicted N-glycosylation sites located at about amino acids 202 to 205, 247 to 250, and 255 to 258 of SEQ ID NO:49;

[3123] five predicted Protein Kinase C phosphorylation sites (PS00005) at about amino acids 33 to 35, about 37 to 39, about 138 to 140, about 339 to 341, and about 413 to 415 of SEQ ID NO:49;

[3124] one cAMP/cGMP dependent protein kinase phosphorylation site (PS00004) at about amino acids 135 to 138 of SEQ ID NO:49;

[3125] two predicted Casein Kinase II phosphorylation sites (PS00006) located at about amino acids 37 to 40, and about 337 to 340 of SEQ ID NO:49; and

[3126] six predicted N-myristoylation sites (PS00008) from about amino acids 69 to 74, about 79 to 84, about 165 to 170, about 230 to 235, about 256 to 261, and about 412 to 417 of SEQ ID NO:49.

[3127] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.
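Predicted sites of the kinds listed above are commonly located by scanning a protein sequence for short consensus patterns. The Python sketch below illustrates that approach; it is not the procedure used to generate the predictions in this description. The regular expressions are transcriptions of the PROSITE patterns as generally published (N-{P}-[ST]-{P}; [ST]-x-[RK]; [RK](2)-x-[ST]; [ST]-x(2)-[DE]; G-{EDRKHPFYW}-x(2)-[STAGCN]-{P}) and should be checked against a current PROSITE release before use. The function name and the toy peptide are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Regular-expression transcriptions of PROSITE-style patterns (assumed; verify
# against the current PROSITE release before relying on them).
PATTERNS: Dict[str, str] = {
    "N-glycosylation": r"N[^P][ST][^P]",
    "PKC phosphorylation (PS00005)": r"[ST].[RK]",
    "cAMP/cGMP-dependent kinase phosphorylation (PS00004)": r"[RK]{2}.[ST]",
    "Casein kinase II phosphorylation (PS00006)": r"[ST].{2}[DE]",
    "N-myristoylation (PS00008)": r"G[^EDRKHPFYW].{2}[STAGCN][^P]",
}


def find_sites(protein: str) -> Dict[str, List[Tuple[int, int]]]:
    """Return 1-based inclusive (start, end) positions for each pattern.

    A lookahead is used so that overlapping matches are all reported.
    """
    protein = protein.upper()
    hits: Dict[str, List[Tuple[int, int]]] = {}
    for name, pattern in PATTERNS.items():
        lookahead = re.compile(f"(?=({pattern}))")
        hits[name] = [
            (m.start() + 1, m.start() + len(m.group(1)))
            for m in lookahead.finditer(protein)
        ]
    return hits


if __name__ == "__main__":
    toy_peptide = "ANGSARRASAEGAAASA"  # hypothetical peptide, not SEQ ID NO:49
    for site, positions in find_sites(toy_peptide).items():
        print(site, positions)
```

Because these patterns are short, they match frequently by chance, so a match indicates only a potential site.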

[3128] A plasmid containing the nucleotide sequence encoding human 56939 (clone “Fbh56939FL”) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

[3129] The 56939 protein contains a significant number of structural characteristics in common with members of the acyl-CoA thioesterase family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics.

[3130] Members of the acyl-CoA thioesterase family of proteins are characterized by a common fold, which, in particular, includes an α/β hydrolase fold of an eight-stranded β-sheet with surrounding α-helices. Family members also frequently have common catalytic residues. These residues are found within the α/β hydrolase fold. The most critical catalytic residues are those of the catalytic triad, which are positioned to favor nucleophilic attack of bound substrates. Within the acyl-CoA thioesterase family, the nucleophilic side chain is preferably a serine, but can be a cysteine; the remaining residues of the triad are preferably a histidine and an aspartic acid. Family members preferably catalyze the hydrolysis of acyl-CoA substrates, e.g., a fatty acid esterified with coenzyme A (CoA), e.g., palmitoyl-CoA, and bile salts conjugated with CoA.

[3131] A 56939 polypeptide can include an “acyl-CoA thioesterase domain” or regions homologous with an “acyl-CoA thioesterase domain.”

[3132] As used herein, the term “acyl-CoA thioesterase domain” includes an amino acid sequence of about 300 to 500 amino acid residues in length and having a score for the alignment of the sequence to the acyl-CoA thioesterase domain (PD0006914) of at least 400. Preferably, an acyl-CoA thioesterase domain includes at least one, two, and preferably three conserved catalytic residues positioned to favor the nucleophilic attack of bound substrates. The nucleophilic side chain is preferably a serine, but can be a cysteine; the remaining residues of the triad are preferably a histidine and an aspartic acid. For example, a 56939 polypeptide has a serine residue located at about residue 232 of SEQ ID NO:49, an aspartic acid residue located at about residue 325 of SEQ ID NO:49, and a histidine residue located at about residue 360 of SEQ ID NO:49, corresponding to the conserved catalytic triad characteristic of acyl-CoA thioesterase domains. Preferably, an acyl-CoA thioesterase domain includes at least about 300 to 500 amino acids, more preferably about 350 to 450 amino acid residues, or about 390 to 420 amino acids and has a score for the alignment of the sequence to the acyl-CoA thioesterase domain (PD0006914) of at least 400, preferably 600, more preferably 800, more preferably 1000 or greater. The acyl-CoA thioesterase domain has been assigned the ProDomain Accession number PD0006914 (http://protein.toulouse.inra.fr/prodom.html). An alignment of the acyl-CoA thioesterase domain (amino acids about 1 to 415 of SEQ ID NO:49) of human 56939 with a consensus amino acid sequence is depicted in FIG. 25.
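Whether a candidate sequence carries the expected residues at the stated triad positions can be checked mechanically, as in the sketch below. The positions and permitted residues are taken from the paragraph above (serine, or alternatively cysteine, at about residue 232; aspartic acid at about 325; histidine at about 360). The function name is hypothetical, the test sequence is a synthetic placeholder rather than SEQ ID NO:49, and the check uses exact positions even though the text gives them as approximate.

```python
from typing import Dict

# 1-based position -> permitted one-letter residues (from the text above).
TRIAD_POSITIONS: Dict[int, str] = {
    232: "SC",  # nucleophile: preferably serine, but can be cysteine
    325: "D",   # aspartic acid
    360: "H",   # histidine
}


def check_residues(protein: str, expected: Dict[int, str]) -> Dict[int, bool]:
    """Report, per 1-based position, whether the residue is one of those permitted."""
    result: Dict[int, bool] = {}
    for position, allowed in expected.items():
        if 1 <= position <= len(protein):
            result[position] = protein[position - 1].upper() in allowed
        else:
            result[position] = False  # position lies outside the sequence
    return result


if __name__ == "__main__":
    # Synthetic 421-residue placeholder with the triad residues inserted.
    residues = ["A"] * 421
    residues[231], residues[324], residues[359] = "S", "D", "H"
    print(check_residues("".join(residues), TRIAD_POSITIONS))
```

For sequences other than SEQ ID NO:49, positions would first need to be mapped through an alignment, since insertions and deletions shift residue numbering.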

[3133] In a preferred embodiment a 56939 polypeptide or protein has an “acyl-CoA thioesterase domain” or a region which includes at least about 300 to 500 amino acids, more preferably about 350 to 450 amino acids, more preferably about 380 to 430 or 400 to 415 amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with an “acyl-CoA thioesterase domain,” e.g., the acyl-CoA thioesterase domain of human 56939 (e.g., residues 1 to 415 of SEQ ID NO:49).

[3134] To identify the presence of an “acyl-CoA thioesterase” domain in a 56939 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against a database of domains, e.g., the ProDom database (Corpet et al. (1999), Nucl. Acids Res. 27:263-267). The ProDom protein domain database consists of an automatic compilation of homologous domains. Current versions of ProDom are built using recursive PSI-BLAST searches (Altschul S F et al. (1997) Nucleic Acids Res. 25:3389-3402; Gouzy et al. (1999) Computers and Chemistry 23:333-340) of the SWISS-PROT 38 and TREMBL protein databases. The database automatically generates a consensus sequence for each domain. A BLAST search was performed against the ProDom database resulting in the identification of an “acyl-CoA thioesterase” domain in the amino acid sequence of human 56939 at about residues 1 to 415 of SEQ ID NO:49 (see FIG. 25).

[3135] A 56939 family member can include at least one acyl-CoA thioesterase domain (PD006914). Furthermore, a 56939 family member can include at least one, two, three, four, preferably five protein kinase C phosphorylation sites (PS00005); at least one, preferably two predicted casein kinase II phosphorylation sites (PS00006); at least one, two, three, four, five and preferably six predicted N-myristoylation sites (PS00008); at least one, two, preferably three predicted N-glycosylation sites; and at least one cAMP/cGMP dependent protein kinase phosphorylation site (PS00004).

[3136] As the 56939 polypeptides of the invention may modulate 56939-mediated activities, they may be useful for developing novel diagnostic and therapeutic agents for 56939-mediated or related disorders, as described below.

[3137] As used herein, a “56939 activity”, “biological activity of 56939” or “functional activity of 56939”, refers to an activity exerted by a 56939 protein, polypeptide or nucleic acid molecule. For example, a 56939 activity can be an activity exerted by 56939 in a physiological milieu on, e.g., a 56939-responsive cell or on a 56939 substrate, e.g., a protein substrate. A 56939 activity can be determined in vivo or in vitro. In one embodiment, a 56939 activity is a direct activity, such as an association with a 56939 target molecule. A “target molecule” or “binding partner” is a molecule with which a 56939 protein binds or interacts in nature. In an exemplary embodiment, 56939 is an enzyme for lipid substrates, e.g., fatty acid-CoAs and/or bile acid-CoAs.

[3138] A 56939 activity can also be an indirect activity, e.g., a cellular signaling activity mediated by interaction of the 56939 protein with a 56939 receptor. The features of the 56939 molecules of the present invention can provide biological activities similar to those of acyl-CoA thioesterase family members. For example, the 56939 proteins of the present invention can have or mediate one or more of the following activities: (1) hydrolysis of fatty acid substrates, e.g., fatty acid-CoA conjugates; (2) hydrolysis of lipids; (3) hydrolysis of bile salt-CoA conjugates; or (4) N-acyltransferase activity, e.g., mediating the formation of N-acyl bile acid conjugates.

[3139] Based on the above-described sequence similarities, the 56939 molecules of the present invention are predicted to have similar biological activities as acyl-CoA thioesterase family members. For example, the 56939 protein may bind fatty acids, e.g., palmitoyl, and/or other lipid metabolic enzymes, e.g., fatty acid synthases. Accordingly, the 56939 protein may mediate one or more of the following physiological processes: metabolite regulation, detoxification, cardio-vascular function, liver function, and/or kidney function. Furthermore, as described in Example 35, the 56939 protein is highly expressed in tissues including kidney, liver, adipose, brain, and tumor cells.

[3140] The 56939 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of cellular proliferative and/or differentiative, metabolic, cardio-vascular, hepatic, kidney, and/or brain disorders.

[3141] Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast and liver origin.

[3142] As used herein, the terms “cancer”, “hyperproliferative” and “neoplastic” refer to cells having the capacity for autonomous growth. Examples of such cells include cells having an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative and neoplastic disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. “Pathologic hyperproliferative” cells occur in disease states characterized by malignant tumor growth. Examples of non-pathologic hyperproliferative cells include proliferation of cells associated with wound repair.

[3143] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[3144] The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

[3145] The term “sarcoma” is art recognized and refers to malignant tumors of mesenchymal derivation.

[3146] Additional examples of proliferative disorders include hematopoietic neoplastic disorders. As used herein, the term “hematopoietic neoplastic disorders” includes diseases involving hyperplastic/neoplastic cells of hematopoietic origin. A hematopoietic neoplastic disorder can arise from myeloid, lymphoid or erythroid lineages, or precursor cells thereof. Preferably, the diseases arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit. Rev. in Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to, acute lymphoblastic leukemia (ALL), which includes B-lineage ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell leukemia (HCL) and Waldenstrom's macroglobulinemia (WM). Additional forms of malignant lymphomas include, but are not limited to, non-Hodgkin lymphoma and variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGL), Hodgkin's disease and Reed-Sternberg disease.

[3147] As used herein, disorders involving the heart, or “cardiovascular disease” or a “cardiovascular disorder” includes a disease or disorder which affects the cardiovascular system, e.g., the heart, the blood vessels, and/or the blood. A cardiovascular disorder can be caused by an imbalance in arterial pressure, a malfunction of the heart, or an occlusion of a blood vessel, e.g., by a thrombus. A cardiovascular disorder includes, but is not limited to, disorders such as arteriosclerosis, atherosclerosis, cardiac hypertrophy, ischemia reperfusion injury, restenosis, arterial inflammation, vascular wall remodeling, ventricular remodeling, rapid ventricular pacing, coronary microembolism, tachycardia, bradycardia, pressure overload, aortic bending, coronary artery ligation, vascular heart disease, valvular disease, including but not limited to, valvular degeneration caused by calcification, rheumatic heart disease, endocarditis, or complications of artificial valves; atrial fibrillation, long-QT syndrome, congestive heart failure, sinus node dysfunction, angina, heart failure, hypertension, atrial flutter, pericardial disease, including but not limited to, pericardial effusion and pericarditis; cardiomyopathies, e.g., dilated cardiomyopathy or idiopathic cardiomyopathy, myocardial infarction, coronary artery disease, coronary artery spasm, ischemic disease, arrhythmia, sudden cardiac death, and cardiovascular developmental disorders (e.g., arteriovenous malformations, arteriovenous fistulae, Raynaud's syndrome, neurogenic thoracic outlet syndrome, causalgia/reflex sympathetic dystrophy, hemangioma, aneurysm, cavernous angioma, aortic valve stenosis, atrial septal defects, atrioventricular canal, coarctation of the aorta, Ebstein's anomaly, hypoplastic left heart syndrome, interruption of the aortic arch, mitral valve prolapse, ductus arteriosus, patent foramen ovale, partial anomalous pulmonary venous return, pulmonary atresia with ventricular septal defect, pulmonary atresia without ventricular septal defect, persistence of the fetal circulation, pulmonary valve stenosis, single ventricle, total anomalous pulmonary venous return, transposition of the great vessels, tricuspid atresia, truncus arteriosus, ventricular septal defects). A cardiovascular disease or disorder also can include an endothelial cell disorder.

[3148] Disorders involving the heart include, but are not limited to, heart failure, including but not limited to, cardiac hypertrophy, left-sided heart failure, and right-sided heart failure; ischemic heart disease, including but not limited to angina pectoris, myocardial infarction, chronic ischemic heart disease, and sudden cardiac death; hypertensive heart disease, including but not limited to, systemic (left-sided) hypertensive heart disease and pulmonary (right-sided) hypertensive heart disease; valvular heart disease, including but not limited to, valvular degeneration caused by calcification, such as calcific aortic stenosis, calcification of a congenitally bicuspid aortic valve, and mitral annular calcification, and myxomatous degeneration of the mitral valve (mitral valve prolapse), rheumatic fever and rheumatic heart disease, infective endocarditis, and noninfected vegetations, such as nonbacterial thrombotic endocarditis and endocarditis of systemic lupus erythematosus (Libman-Sacks disease), carcinoid heart disease, and complications of artificial valves; myocardial disease, including but not limited to dilated cardiomyopathy, hypertrophic cardiomyopathy, restrictive cardiomyopathy, and myocarditis; pericardial disease, including but not limited to, pericardial effusion and hemopericardium and pericarditis, including acute pericarditis and healed pericarditis, and rheumatoid heart disease; neoplastic heart disease, including but not limited to, primary cardiac tumors, such as myxoma, lipoma, papillary fibroelastoma, rhabdomyoma, and sarcoma, and cardiac effects of noncardiac neoplasms; congenital heart disease, including but not limited to, left-to-right shunts (late cyanosis), such as atrial septal defect, ventricular septal defect, patent ductus arteriosus, and atrioventricular septal defect, right-to-left shunts (early cyanosis), such as tetralogy of Fallot, transposition of great arteries, truncus arteriosus, tricuspid atresia, and total anomalous pulmonary venous connection, obstructive congenital anomalies, such as coarctation of aorta, pulmonary stenosis and atresia, and aortic stenosis and atresia, and disorders involving cardiac transplantation.

[3149] Additional examples of vascular disorders include, but are not limited to, responses of vascular cell walls to injury, such as endothelial dysfunction and endothelial activation and intimal thickening; vascular diseases including, but not limited to, congenital anomalies, such as arteriovenous fistula, atherosclerosis, and hypertensive vascular disease, such as hypertension; inflammatory disease (the vasculitides), such as giant cell (temporal) arteritis, Takayasu arteritis, polyarteritis nodosa (classic), Kawasaki syndrome (mucocutaneous lymph node syndrome), microscopic polyangiitis (microscopic polyarteritis, hypersensitivity or leukocytoclastic angiitis), Wegener granulomatosis, thromboangiitis obliterans (Buerger disease), vasculitis associated with other disorders, and infectious arteritis; Raynaud disease; aneurysms and dissection, such as abdominal aortic aneurysms, syphilitic (luetic) aneurysms, and aortic dissection (dissecting hematoma); disorders of veins and lymphatics, such as varicose veins, thrombophlebitis and phlebothrombosis, obstruction of superior vena cava (superior vena cava syndrome), obstruction of inferior vena cava (inferior vena cava syndrome), and lymphangitis and lymphedema; tumors, including benign tumors and tumor-like conditions, such as hemangioma, lymphangioma, glomus tumor (glomangioma), vascular ectasias, and bacillary angiomatosis, and intermediate-grade (borderline low-grade malignant) tumors, such as Kaposi sarcoma and hemangioendothelioma, and malignant tumors, such as angiosarcoma and hemangiopericytoma; and pathology of therapeutic interventions in vascular disease, such as balloon angioplasty and related techniques and vascular replacement, such as coronary artery bypass graft surgery.

[3150] As 56939 is highly expressed in liver tissue, 56939 may be involved in disorders involving the liver. Disorders which may be treated or diagnosed by methods described herein include, but are not limited to, disorders associated with an accumulation in the liver of fibrous tissue, such as that resulting from an imbalance between production and degradation of the extracellular matrix accompanied by the collapse and condensation of preexisting fibers. The methods described herein can be used to diagnose or treat hepatocellular necrosis or injury induced by a wide variety of agents including processes which disturb homeostasis, such as an inflammatory process, tissue damage resulting from toxic injury or altered hepatic blood flow, and infections (e.g., bacterial, viral and parasitic). For example, the methods can be used for the early detection of hepatic injury, such as portal hypertension or hepatic fibrosis. In addition, the methods can be employed to detect liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, A1-antitrypsin deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease), disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome). 
Additionally, the methods described herein may be useful for the early detection and treatment of liver injury associated with the administration of various chemicals or drugs, such as, for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol, or which represents a hepatic manifestation of a vascular disorder such as obstruction of either the intrahepatic or extrahepatic bile flow or an alteration in hepatic circulation resulting, for example, from chronic heart failure, veno-occlusive disease, portal vein thrombosis or Budd-Chiari syndrome.

[3151] Disorders which may be treated or diagnosed by methods described herein also include disorders of metabolism. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes. For example, many acyl-CoA thioesterase family members have lipase activity and can oxidize fatty acids. 56939 may further be involved in hereditary diseases, e.g., neuronal ceroid lipofuscinosis (Batten's disease). Infantile neuronal ceroid lipofuscinosis can be caused by a defect in a number of genes, including PPT1, a palmitoyl protein thioesterase.

[3152] 56939 may have a role in removing xenobiotic epoxides and/or other toxins from the body. Furthermore, 56939 may contribute to the metabolism of drugs and other pharmaceuticals. The study of polymorphisms in the 56939 gene can provide a useful resource for pharmacogenomic (see below) analysis of drug responses. Additionally, variations in 56939 may contribute to population differences in sensitivity to environmental toxins.

[3153] 56939 is highly expressed in the brain cortex and hypothalamus. Thus, 56939 may be involved in disorders involving these tissues, e.g., brain disorders. Disorders involving the brain include, but are not limited to, disorders involving neurons, and disorders involving glia, such as astrocytes, oligodendrocytes, ependymal cells, and microglia; cerebral edema, raised intracranial pressure and herniation, and hydrocephalus; malformations and developmental diseases, such as neural tube defects, forebrain anomalies, posterior fossa anomalies, and syringomyelia and hydromyelia; perinatal brain injury; cerebrovascular diseases, such as those related to hypoxia, ischemia, and infarction, including hypotension, hypoperfusion, and low-flow states (global cerebral ischemia) and focal cerebral ischemia (infarction from obstruction of local blood supply), intracranial hemorrhage, including intracerebral (intraparenchymal) hemorrhage, subarachnoid hemorrhage and ruptured berry aneurysms, and vascular malformations, hypertensive cerebrovascular disease, including lacunar infarcts, slit hemorrhages, and hypertensive encephalopathy; infections, such as acute meningitis, including acute pyogenic (bacterial) meningitis and acute aseptic (viral) meningitis, acute focal suppurative infections, including brain abscess, subdural empyema, and extradural abscess, chronic bacterial meningoencephalitis, including tuberculosis and mycobacterioses, neurosyphilis, and neuroborreliosis (Lyme disease), viral meningoencephalitis, including arthropod-borne (Arbo) viral encephalitis, Herpes simplex virus Type 1, Herpes simplex virus Type 2, Varicella-zoster virus (Herpes zoster), cytomegalovirus, poliomyelitis, rabies, and human immunodeficiency virus 1, including HIV-1 meningoencephalitis (subacute encephalitis), vacuolar myelopathy, AIDS-associated myopathy, peripheral neuropathy, and AIDS in children, progressive multifocal leukoencephalopathy, subacute sclerosing panencephalitis, fungal meningoencephalitis, other infectious diseases of the nervous system; transmissible spongiform encephalopathies (prion diseases); demyelinating diseases, including multiple sclerosis, multiple sclerosis variants, acute disseminated encephalomyelitis and acute necrotizing hemorrhagic encephalomyelitis, and other diseases with demyelination; degenerative diseases, such as degenerative diseases affecting the cerebral cortex, including Alzheimer disease and Pick disease, degenerative diseases of basal ganglia and brain stem, including Parkinsonism, idiopathic Parkinson disease (paralysis agitans), progressive supranuclear palsy, corticobasal degeneration, multiple system atrophy, including striatonigral degeneration, Shy-Drager syndrome, and olivopontocerebellar atrophy, and Huntington disease; spinocerebellar degenerations, including spinocerebellar ataxias, including Friedreich ataxia, and ataxia-telangiectasia, degenerative diseases affecting motor neurons, including amyotrophic lateral sclerosis (motor neuron disease), bulbospinal atrophy (Kennedy syndrome), and spinal muscular atrophy; inborn errors of metabolism, such as leukodystrophies, including Krabbe disease, metachromatic leukodystrophy, adrenoleukodystrophy, Pelizaeus-Merzbacher disease, and Canavan disease, mitochondrial encephalomyopathies, including Leigh disease and other mitochondrial encephalomyopathies; toxic and acquired metabolic diseases, including vitamin deficiencies such as thiamine (vitamin B₁) deficiency and vitamin B₁₂ deficiency, neurologic sequelae of metabolic disturbances, including hypoglycemia, hyperglycemia, and hepatic encephalopathy, toxic disorders, including carbon monoxide, methanol, ethanol, and radiation, including combined methotrexate and radiation-induced injury; tumors, such as gliomas, including astrocytoma, including fibrillary (diffuse) astrocytoma and glioblastoma multiforme, pilocytic astrocytoma, pleomorphic xanthoastrocytoma, and brain stem glioma, oligodendroglioma, and ependymoma and related paraventricular mass lesions, neuronal tumors, poorly differentiated neoplasms, including medulloblastoma, other parenchymal tumors, including primary brain lymphoma, germ cell tumors, and pineal parenchymal tumors, meningiomas, metastatic tumors, paraneoplastic syndromes, peripheral nerve sheath tumors, including schwannoma, neurofibroma, and malignant peripheral nerve sheath tumor (malignant schwannoma), and neurocutaneous syndromes (phakomatoses), including neurofibromatosis, including Type 1 neurofibromatosis (NF1) and Type 2 neurofibromatosis (NF2), tuberous sclerosis, and Von Hippel-Lindau disease.

[3154] As 56939 is strongly expressed in kidney tissue, 56939 may be involved in disorders involving the kidney. Disorders involving the kidney include, but are not limited to, congenital anomalies including, but not limited to, cystic diseases of the kidney, that include but are not limited to, cystic renal dysplasia, autosomal dominant (adult) polycystic kidney disease, autosomal recessive (childhood) polycystic kidney disease, and cystic diseases of renal medulla, which include, but are not limited to, medullary sponge kidney, and nephronophthisis-uremic medullary cystic disease complex, acquired (dialysis-associated) cystic disease, such as simple cysts; glomerular diseases including pathologies of glomerular injury that include, but are not limited to, in situ immune complex deposition, that includes, but is not limited to, anti-GBM nephritis, Heymann nephritis, and antibodies against planted antigens, circulating immune complex nephritis, antibodies to glomerular cells, cell-mediated immunity in glomerulonephritis, activation of alternative complement pathway, epithelial cell injury, and pathologies involving mediators of glomerular injury including cellular and soluble mediators, acute glomerulonephritis, such as acute proliferative (poststreptococcal, postinfectious) glomerulonephritis, including but not limited to, poststreptococcal glomerulonephritis and nonstreptococcal acute glomerulonephritis, rapidly progressive (crescentic) glomerulonephritis, nephrotic syndrome, membranous glomerulonephritis (membranous nephropathy), minimal change disease (lipoid nephrosis), focal segmental glomerulosclerosis, membranoproliferative glomerulonephritis, IgA nephropathy (Berger disease), focal proliferative and necrotizing glomerulonephritis (focal glomerulonephritis), hereditary nephritis, including but not limited to, Alport syndrome and thin membrane disease (benign familial hematuria), chronic glomerulonephritis, glomerular lesions associated with systemic disease, including but not limited to, systemic lupus erythematosus, Henoch-Schönlein purpura, bacterial endocarditis, diabetic glomerulosclerosis, amyloidosis, fibrillary and immunotactoid glomerulonephritis, and other systemic disorders; diseases affecting tubules and interstitium, including acute tubular necrosis and tubulointerstitial nephritis, including but not limited to, pyelonephritis and urinary tract infection, acute pyelonephritis, chronic pyelonephritis and reflux nephropathy, and tubulointerstitial nephritis induced by drugs and toxins, including but not limited to, acute drug-induced interstitial nephritis, analgesic abuse nephropathy, nephropathy associated with nonsteroidal anti-inflammatory drugs, and other tubulointerstitial diseases including, but not limited to, urate nephropathy, hypercalcemia and nephrocalcinosis, and multiple myeloma; diseases of blood vessels including benign nephrosclerosis, malignant hypertension and accelerated nephrosclerosis, renal artery stenosis, and thrombotic microangiopathies including, but not limited to, classic (childhood) hemolytic-uremic syndrome, adult hemolytic-uremic syndrome/thrombotic thrombocytopenic purpura, idiopathic HUS/TTP, and other vascular disorders including, but not limited to, atherosclerotic ischemic renal disease, atheroembolic renal disease, sickle cell disease nephropathy, diffuse cortical necrosis, and renal infarcts; urinary tract obstruction (obstructive uropathy); urolithiasis (renal calculi, stones); and tumors of the kidney including, but not limited to, benign tumors, such as renal papillary adenoma, renal fibroma or hamartoma (renomedullary interstitial cell tumor), angiomyolipoma, and oncocytoma, and malignant tumors, including renal cell carcinoma (hypernephroma, adenocarcinoma of kidney), which includes urothelial carcinomas of renal pelvis.

[3155] The 56939 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO:49 are collectively referred to as “polypeptides or proteins of the invention” or “56939 polypeptides or proteins”. Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “56939 nucleic acids.” 56939 molecules refer to 56939 nucleic acids, polypeptides, and antibodies.

[3156] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA), RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA. A DNA or RNA analog can be synthesized from nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[3157] The term “isolated nucleic acid molecule” or “purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules present in the natural source of the nucleic acid. For example, with regard to genomic DNA, the term “isolated” includes nucleic acid molecules which are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[3158] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and nonaqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified.
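By way of non-limiting illustration only, the four sets of hybridization and wash conditions recited above can be recorded as a lookup table. The following Python sketch is not part of the disclosed methods; the names `Stringency`, `CONDITIONS`, `DEFAULT_LEVEL`, and `conditions_for` are hypothetical and introduced solely for this example, and the values are transcribed from the preceding paragraph.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stringency:
    """One set of hybridization and wash conditions (illustrative record)."""
    hybridization: str
    wash: str
    wash_temp_c: int  # wash temperature in degrees C (a minimum for "low")


# Values transcribed from the paragraph above; keys are hypothetical labels.
CONDITIONS = {
    "low": Stringency("6x SSC at about 45 C",
                      "two washes in 0.2x SSC, 0.1% SDS", 50),
    "medium": Stringency("6x SSC at about 45 C",
                         "one or more washes in 0.2x SSC, 0.1% SDS", 60),
    "high": Stringency("6x SSC at about 45 C",
                       "one or more washes in 0.2x SSC, 0.1% SDS", 65),
    "very_high": Stringency("0.5 M sodium phosphate, 7% SDS at 65 C",
                            "one or more washes in 0.2x SSC, 1% SDS", 65),
}

# The text states that very high stringency applies unless otherwise specified.
DEFAULT_LEVEL = "very_high"


def conditions_for(level: Optional[str] = None) -> Stringency:
    """Return the conditions for a level, falling back to the stated default."""
    return CONDITIONS[level or DEFAULT_LEVEL]
```

In this sketch, calling `conditions_for()` with no argument returns the very high stringency entry, mirroring the statement that those conditions should be used unless otherwise specified.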

[3159] Preferably, an isolated nucleic acid molecule of the invention that hybridizes under a stringency condition described herein to the sequence of SEQ ID NO:48 or SEQ ID NO:50, corresponds to a naturally-occurring nucleic acid molecule.

[3160] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature. For example, a naturally-occurring nucleic acid molecule can encode a natural protein.

[3161] As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules which include at least an open reading frame encoding a 56939 protein. The gene can optionally further include non-coding sequences, e.g., regulatory sequences and introns. Preferably, a gene encodes a mammalian 56939 protein or derivative thereof.

[3162] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. “Substantially free” means that a preparation of 56939 protein is at least 10% pure. In a preferred embodiment, the preparation of 56939 protein has less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-56939 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-56939 chemicals. When the 56939 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[3163] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 56939 without abolishing or substantially altering a 56939 activity. Preferably the alteration does not substantially alter the 56939 activity, e.g., the activity is at least 20%, 40%, 60%, 70% or 80% of wild-type. An “essential” amino acid residue is a residue that, when altered from the wild-type sequence of 56939, results in abolishing a 56939 activity such that less than 20% of the wild-type activity is present. For example, conserved amino acid residues in 56939 are predicted to be particularly unamenable to alteration.
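By way of non-limiting illustration only, the distinction drawn above between “essential” and “non-essential” residues can be expressed as a comparison of mutant activity to wild-type activity. The following Python sketch assumes that activities are measured in the same arbitrary units by an assay not specified here; the function name `classify_residue` and its parameters are hypothetical. The default cutoff of 20% is taken from the preceding paragraph.

```python
def classify_residue(mutant_activity: float,
                     wild_type_activity: float,
                     threshold: float = 0.20) -> str:
    """Classify an altered residue by the fraction of wild-type activity retained.

    Below `threshold` (20% by default, per the text) the residue is treated
    as "essential"; at or above it, as "non-essential".
    """
    if wild_type_activity <= 0:
        raise ValueError("wild-type activity must be positive")
    fraction = mutant_activity / wild_type_activity
    return "essential" if fraction < threshold else "non-essential"
```

A stricter embodiment (e.g., requiring at least 40%, 60%, 70% or 80% of wild-type activity for a residue to be considered non-essential) could be modeled in this sketch by passing a larger `threshold`.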

[3164] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a 56939 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a 56939 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 56939 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO:48 or SEQ ID NO:50, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
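By way of non-limiting illustration only, the side chain families enumerated above can be encoded as sets of one-letter amino acid codes, and a substitution can be tested for membership of both residues in at least one common family. The following Python sketch is an assumption-laden example; the names `SIDE_CHAIN_FAMILIES` and `is_conservative` are hypothetical, and the family memberships are exactly those listed in the preceding paragraph.

```python
# Families as listed in the text, using standard one-letter amino acid codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),                # lysine, arginine, histidine
    "acidic": set("DE"),                # aspartic acid, glutamic acid
    "uncharged_polar": set("GNQSTYC"),  # Gly, Asn, Gln, Ser, Thr, Tyr, Cys
    "nonpolar": set("AVLIPFMW"),        # Ala, Val, Leu, Ile, Pro, Phe, Met, Trp
    "beta_branched": set("TVI"),        # threonine, valine, isoleucine
    "aromatic": set("YFWH"),            # Tyr, Phe, Trp, His
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if both residues share at least one side chain family."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in family and b in family
               for family in SIDE_CHAIN_FAMILIES.values())
```

Because some residues appear in more than one family (e.g., threonine is both uncharged polar and beta-branched), this sketch treats a substitution as conservative if any single family contains both residues.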

[3165] As used herein, a “biologically active portion” of a 56939 protein includes a fragment of a 56939 protein which participates in an interaction, e.g., an intramolecular or an inter-molecular interaction. An inter-molecular interaction can be a specific binding interaction or an enzymatic interaction (e.g., the interaction can be transient and a covalent bond is formed or broken). An inter-molecular interaction can be between a 56939 molecule and a non-56939 molecule or between a first 56939 molecule and a second 56939 molecule (e.g., a dimerization interaction). Biologically active portions of a 56939 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 56939 protein, e.g., the amino acid sequence shown in SEQ ID NO:49, which include fewer amino acids than the full-length 56939 proteins, and exhibit at least one activity of a 56939 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 56939 protein, e.g., hydrolysis of fatty acid-CoA substrates. A biologically active portion of a 56939 protein can be a polypeptide which is, for example, 10, 25, 50, 100, 200 or more amino acids in length. Biologically active portions of a 56939 protein can be used as targets for developing agents which modulate a 56939 mediated activity, e.g., hydrolysis of fatty acid-CoA substrates.

[3166] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[3167] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, 60%, and even more preferably at least 70%, 80%, 90%, 100% of the length of the reference sequence. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”).

[3168] The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
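By way of non-limiting illustration only, percent identity can be computed from two sequences that have already been aligned. The following Python sketch assumes one common convention, namely that the number of identical positions is divided by the number of alignment columns (gap columns included), so that introduced gaps lower the result; other conventions exist, and the function name `percent_identity` and the gap character are hypothetical choices made for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns, counting gap columns as mismatches."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap; they hold no residue.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)
```

For example, under this convention the aligned pair `AC-T` and `ACGT` shares three identical positions over four columns, giving 75% identity.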

[3169] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch ((1970) J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used unless otherwise specified) is a BLOSUM 62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
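By way of non-limiting illustration only, the global alignment approach of Needleman and Wunsch can be sketched with dynamic programming. The following Python example is a simplified teaching sketch and is not the GAP program: it assumes a flat match/mismatch score in place of a BLOSUM 62 or PAM250 matrix and a single linear gap penalty in place of separate gap and length weights. The function name `needleman_wunsch` and its default scores are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -2):
    """Globally align a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align residues
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]
```

In practice, an established alignment program with the scoring matrix and gap parameters recited above would be used; the sketch is intended only to show how gaps are introduced to obtain an optimal global alignment.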

[3170] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Meyers and W. Miller ((1989) CABIOS, 4:11-17) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[3171] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 56939 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 56939 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25:3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[3172] Particular 56939 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO:49. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:49 are termed substantially identical.

[3173] In the context of a nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:48 or 50 are termed substantially identical.

[3174] “Misexpression or aberrant expression”, as used herein, refers to a non-wild type pattern of gene expression at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over- or under-expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of altered, e.g., increased or decreased, expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, translated amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[3175] “Subject,” as used herein, refers to human and non-human animals. The term “non-human animals” of the invention includes all vertebrates, e.g., mammals, such as non-human primates (particularly higher primates), sheep, dog, rodent (e.g., mouse or rat), guinea pig, goat, pig, cat, rabbits, cow, and non-mammals, such as chickens, amphibians, reptiles, etc. In a preferred embodiment, the subject is a human. In another embodiment, the subject is an experimental animal or animal suitable as a disease model.

[3176] A “purified preparation of cells”, as used herein, refers to an in vitro preparation of cells. In the case of cells from multicellular organisms (e.g., plants and animals), a purified preparation of cells is a subset of cells obtained from the organism, not the entire intact organism. In the case of unicellular microorganisms (e.g., cultured cells and microbial cells), it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[3177] Various aspects of the invention are described in further detail below.

[3178] Isolated Nucleic Acid Molecules of 56939

[3179] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes a 56939 polypeptide described herein, e.g., a full-length 56939 protein or a fragment thereof, e.g., a biologically active portion of 56939 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 56939 mRNA, and fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[3180] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO:48, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 56939 protein (i.e., “the coding region” of SEQ ID NO:48, as shown in SEQ ID NO:50), as well as 5′untranslated sequences. Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:48 (e.g., SEQ ID NO:50) and, e.g., no flanking sequences which normally accompany the subject sequence. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to a fragment of protein from about amino acid 1 to amino acid 415 of SEQ ID NO:49.

[3181] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule which is a complement of the nucleotide sequence shown in SEQ ID NO:48 or SEQ ID NO:50, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO:48 or SEQ ID NO:50, such that it can hybridize (e.g., under a stringency condition described herein) to the nucleotide sequence shown in SEQ ID NO:48 or 50, thereby forming a stable duplex.

[3182] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence which is at least about: 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO:48 or SEQ ID NO:50, or a portion, preferably of the same length, of any of these nucleotide sequences.

[3183] 56939 Nucleic Acid Fragments

[3184] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO:48 or 50. For example, such a nucleic acid molecule can include a fragment which can be used as a probe or primer or a fragment encoding a portion of a 56939 protein, e.g., an immunogenic or biologically active portion of a 56939 protein. A fragment can comprise those nucleotides of SEQ ID NO:48 which encode an acyl-CoA thioesterase domain of human 56939. The nucleotide sequence determined from the cloning of the 56939 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 56939 family members, or fragments thereof, as well as 56939 homologues, or fragments thereof, from other species.

[3185] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment which includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 100 amino acids in length. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention.

[3186] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domains, regions, or functional sites described herein. Thus, for example, a 56939 nucleic acid fragment can include a sequence corresponding to an acyl-CoA thioesterase domain.

[3187] 56939 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under a stringency condition described herein to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO:48 or SEQ ID NO:50, or of a naturally occurring allelic variant or mutant of SEQ ID NO:48 or SEQ ID NO:50.

[3188] In a preferred embodiment the nucleic acid is a probe which is at least 5 or 10, and less than 200, more preferably less than 100, or less than 50, base pairs in length. It should be identical to, or differ by 1, or by fewer than 5 or 10, bases from a sequence disclosed herein. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.
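The counting rule stated above, in which looped-out positions and mismatches are both scored as differences, can be sketched as follows. This is a hedged illustration only: it assumes the two sequences have already been aligned for maximum homology by some other tool, that "-" marks a looped-out (gap) position, and that the function names and the limit argument are hypothetical.

```python
def count_differences(aligned_a, aligned_b):
    """Count columns that differ in a pre-aligned pair of sequences.

    A column with a gap ("-") in either sequence counts as one difference,
    as does a column holding two different residues.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for x, y in zip(aligned_a, aligned_b) if x != y)


def differs_by_fewer_than(aligned_a, aligned_b, limit):
    """Return True if the pair is identical or differs at fewer than `limit` columns."""
    return count_differences(aligned_a, aligned_b) < limit
```

For instance, the aligned pair "ACGT-A" and "AC-TTA" has two looped-out columns and no mismatches, so it would be scored as differing by two bases.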

[3189] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid which encodes an acyl-CoA thioesterase domain from about amino acid 1 to about amino acid 415 of SEQ ID NO:49.

[3190] In another embodiment a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 56939 sequence, e.g., a domain, region, site or other sequence described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical to, or differ by one base from, a sequence disclosed herein or from a naturally occurring variant. For example, primers suitable for amplifying all or a portion of the following region are provided: an acyl-CoA thioesterase domain from about amino acid 1 to 415 of SEQ ID NO:49.

[3191] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[3192] A nucleic acid fragment encoding a “biologically active portion of a 56939 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:48 or 50, which encodes a polypeptide having a 56939 biological activity (e.g., the biological activities of the 56939 proteins are described herein), expressing the encoded portion of the 56939 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 56939 protein. For example, a nucleic acid fragment encoding a biologically active portion of 56939 includes an acyl-CoA thioesterase domain, e.g., amino acid residues about 1 to 415 of SEQ ID NO:49. A nucleic acid fragment encoding a biologically active portion of a 56939 polypeptide may comprise a nucleotide sequence which is greater than 150, 300, 400 or more nucleotides in length.

[3193] In preferred embodiments, the fragment includes at least one, and preferably at least 5, 10, 15, 25, 50, 100, 200, 300, 350, 400, 450, or 500 nucleotides from nucleotides 1-668, 1-795, 1-828, 1-440, 1008-1391, or 1-986 of SEQ ID NO:48.

[3194] In preferred embodiments, the fragment includes the nucleotide sequence of SEQ ID NO:50 and at least one, and preferably at least 5, 10, 15, 25, 50, 75, 100, 200, 300, or 500 consecutive nucleotides of SEQ ID NO:48.

[3195] In preferred embodiments, the fragment includes at least one, and preferably at least 5, 10, 15, 25, 50, 75, 100, 200, 300, 500, 1000, 1100, 1200, or 1300 nucleotides encoding a protein including 5, 10, 15, 20, 25, 30, 40, 50, 100, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, or 420 consecutive amino acids of SEQ ID NO:49.

[3196] In preferred embodiments, a nucleic acid includes a nucleotide sequence that is other than the sequence of A1830209, AA056177, AA639200, AW090080, AW959783, AW959780, or V88752.

[3197] In preferred embodiments, the fragment comprises the coding region of 56939, e.g., the nucleotide sequence of SEQ ID NO:50.

[3198] In preferred embodiments, a nucleic acid includes a nucleotide sequence which is about 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300 or more nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO:48, or SEQ ID NO:50.

[3199] 56939 Nucleic Acid Variants

[3200] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:48 or SEQ ID NO:50. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid which encodes the same 56939 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence which differs by at least 1, but less than 5, 10, 20, 50, or 100, amino acid residues from that shown in SEQ ID NO:49. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[3201] Nucleic acids of the invention can be chosen for having codons which are preferred, or non-preferred, for a particular expression system. E.g., the nucleic acid can be one in which at least one codon, and preferably at least 10%, or 20% of the codons, has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells.
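The idea of altering a fraction of codons without changing the encoded protein can be illustrated with a short Python sketch. It assumes the standard genetic code; the preferred-codon mapping passed in by the caller is purely hypothetical and is not a measured codon-usage table for E. coli, yeast, human, insect, or CHO cells. The function names are likewise illustrative.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def split_codons(cds):
    """Split a coding sequence (DNA letters) into a list of codons."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds):
    """Translate a coding sequence; '*' marks a stop codon."""
    return "".join(CODON_TABLE[codon] for codon in split_codons(cds))


def recode(cds, preferred):
    """Swap in caller-supplied synonymous codons.

    `preferred` maps a one-letter amino acid to the codon to use for it
    (a hypothetical preference table). Returns the recoded sequence and
    the fraction of codons that were altered.
    """
    for aa, codon in preferred.items():
        if CODON_TABLE.get(codon) != aa:
            raise ValueError(f"{codon} does not encode {aa}")
    old = split_codons(cds)
    new = [preferred.get(CODON_TABLE[codon], codon) for codon in old]
    changed = sum(1 for o, n in zip(old, new) if o != n)
    fraction = changed / len(old) if old else 0.0
    return "".join(new), fraction
```

Because every replacement codon is checked against the genetic code before use, the recoded sequence translates to the same protein as the input; the returned fraction can then be compared against a threshold such as the 10% or 20% recited above.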

[3202] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism) or can be non-naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared in the encoded product).

[3203] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO:48 or 50, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 1%, 5%, 10% or 20% of the nucleotides in the subject nucleic acid. If necessary for this analysis the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[3204] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the nucleotide sequence shown in SEQ ID NO:49 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under a stringency condition described herein, to the nucleotide sequence shown in SEQ ID NO:49 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 56939 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 56939 gene.

[3205] Preferred variants include those that are correlated with hydrolase catalytic activity including the ability to hydrolyze fatty acids, e.g., fatty acid-CoAs or bile acid-CoAs.

[3206] Allelic variants of 56939, e.g., human 56939, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 56939 protein within a population that maintain the ability to hydrolyze fatty acids, e.g., fatty acid-CoAs or bile acid-CoAs. Functional allelic variants will typically contain only conservative substitution of one or more amino acids of SEQ ID NO:49, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally-occurring amino acid sequence variants of the 56939, e.g., human 56939, protein within a population that do not have the ability to hydrolyze fatty acids, e.g., fatty acid-CoAs or bile acid-CoAs. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO:49, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[3207] Moreover, nucleic acid molecules encoding other 56939 family members and, thus, which have a nucleotide sequence which differs from the 56939 sequences of SEQ ID NO:48 or SEQ ID NO:50 are intended to be within the scope of the invention.

[3208] Antisense Nucleic Acid Molecules, Ribozymes and Modified 56939 Nucleic Acid Molecules

[3209] In another aspect, the invention features an isolated nucleic acid molecule which is antisense to 56939. An “antisense” nucleic acid can include a nucleotide sequence which is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 56939 coding strand, or to only a portion thereof (e.g., the coding region of human 56939 corresponding to SEQ ID NO:50). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 56939 (e.g., the 5′ and 3′ untranslated regions).

[3210] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 56939 mRNA, but more preferably is an oligonucleotide which is antisense to only a portion of the coding or noncoding region of 56939 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 56939 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence of interest. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
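As a simple illustration of designing an oligonucleotide complementary to the region surrounding the translation start site, the Python sketch below takes a sense-strand sequence written in DNA letters, extracts a window around the start codon, and returns its reverse complement. The window convention (10 nucleotides upstream of the A of ATG plus the first 10 nucleotides beginning at that A), the function names, and the toy input sequence used in the note that follows are assumptions for the example; none of them is taken from SEQ ID NO:48.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def antisense_around_start(sense, start_index, upstream=10, downstream=10):
    """Antisense oligo covering the window around a start codon.

    `sense` is the sense strand in DNA letters and `start_index` is the
    0-based index of the A of the ATG start codon. The window is clipped
    at the ends of the sequence.
    """
    if sense[start_index:start_index + 3].upper() != "ATG":
        raise ValueError("start_index does not point at an ATG codon")
    lo = max(0, start_index - upstream)
    hi = min(len(sense), start_index + downstream)
    return reverse_complement(sense[lo:hi])
```

With a toy sense strand of ten C residues followed by ATGGCCAAATGGG and a start index of 10, the function returns a 20-nucleotide oligonucleotide. Other lengths in the range recited above can be obtained by changing the two window arguments.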

[3211] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[3212] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a 56939 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies which bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

[3213] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[3214] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for a 56939-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of a 56939 cDNA disclosed herein (i.e., SEQ ID NO:48 or SEQ ID NO:50), and a sequence having a known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haseloff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a 56939-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 56939 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science 261:1411-1418.

[3215] 56939 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 56939 gene (e.g., the 56939 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 56939 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6:569-84; Helene, C. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) BioEssays 14:807-15. The potential sequences that can be targeted for triple helix formation can be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.

[3216] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[3217] A 56939 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For non-limiting examples of synthetic oligonucleotides with modifications see Toulmé (2001) Nature Biotech. 19:17 and Faria et al. (2001) Nature Biotech. 19:40-44. Such phosphoramidite oligonucleotides can be effective antisense agents.

[3218] For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4: 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra and Perry-O'Keefe et al. (1996) Proc. Natl. Acad. Sci. 93: 14670-675.

[3219] PNAs of 56939 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 56939 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene, (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes, (e.g., S1 nucleases (Hyrup B. et al. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[3220] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) Bio-Techniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[3221] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region which is complementary to a 56939 nucleic acid of the invention, two complementary regions one having a fluorophore and one a quencher such that the molecular beacon is useful for quantitating the presence of the 56939 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.

[3222] Isolated 56939 Polypeptides

[3223] In another aspect, the invention features an isolated 56939 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-56939 antibodies. 56939 protein can be isolated from cells or tissue sources using standard protein purification techniques. 56939 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[3224] Polypeptides of the invention include those which arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when expressed in a native cell.

[3225] In a preferred embodiment, a 56939 polypeptide has one or more of the following characteristics:

[3226] (i) it has the ability to hydrolyze acyl-CoA molecules to free fatty acids and CoA;

[3227] (ii) it has a molecular weight, e.g., a deduced molecular weight, preferably ignoring any contribution of post-translational modifications, amino acid composition or other physical characteristic of SEQ ID NO:49;

[3228] (iii) it has an overall sequence similarity of at least 60%, more preferably at least 70%, 80%, 90%, or 95%, with a polypeptide of SEQ ID NO:49;

[3229] (iv) it has an acyl-CoA thioesterase domain which is preferably about 70%, 80%, 90% or 95% identical to amino acid residues about 1 to 415 of SEQ ID NO:49;

[3230] (v) it has a serine corresponding to the conserved serine of the catalytic triad located at about position 232 of SEQ ID NO:49;

[3231] (vi) it has a histidine corresponding to the conserved histidine of the catalytic triad located at about position 360 of SEQ ID NO:49;

[3232] (vii) it has an aspartic acid corresponding to the conserved aspartic acid of the catalytic triad located at about position 325 of SEQ ID NO:49; or

[3233] (viii) it has at least 70%, preferably 80%, and most preferably 95% of the cysteines found in the amino acid sequence of the native protein.
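Characteristics (v) through (vii) above can be checked mechanically once a candidate sequence is available. The following Python sketch tests for serine, aspartic acid, and histidine at 1-based positions 232, 325, and 360, respectively. It is a simplified illustration: it assumes the candidate uses the same residue numbering as SEQ ID NO:49, treats the "about" positions as exact, and uses a hypothetical function name. A candidate with insertions or deletions would first need to be aligned to SEQ ID NO:49 so that corresponding positions can be located.

```python
# Catalytic triad residues keyed by 1-based position, as recited in (v)-(vii).
TRIAD = {232: "S", 325: "D", 360: "H"}


def has_catalytic_triad(protein, triad=TRIAD):
    """Return True if `protein` carries each expected residue at its position.

    `protein` is a one-letter amino acid string numbered from 1. No alignment
    is performed, so the positions are taken literally.
    """
    for position, residue in triad.items():
        if position > len(protein):
            return False
        if protein[position - 1].upper() != residue:
            return False
    return True
```

In this sketch, a sequence that is too short to contain one of the positions is reported as lacking the triad rather than raising an error.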

[3234] In a preferred embodiment the 56939 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:49. In one embodiment it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another it differs from the corresponding sequence in SEQ ID NO:49 by at least one residue but less than 20%, 15%, 10% or 5% of the residues in it differ from the corresponding sequence in SEQ ID NO:49. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non essential residue or a conservative substitution. In a preferred embodiment the differences are not in the acyl-CoA thioesterase domain. In another preferred embodiment one or more differences are in the acyl-CoA thioesterase domain.

[3235] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 56939 proteins differ in amino acid sequence from SEQ ID NO:49, yet retain biological activity.

[3236] In one embodiment, the protein includes an amino acid sequence at least about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more homologous to SEQ ID NO:49.

[3237] A 56939 protein or fragment is provided which varies from the sequence of SEQ ID NO:49 in regions defined by amino acids about 415 to 421 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment but which does not differ from SEQ ID NO:49 in regions defined by amino acids about 1 to about 415. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.

[3238] In one embodiment, a biologically active portion of a 56939 protein includes an acyl-CoA thioesterase domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 56939 protein.

[3239] In a preferred embodiment, the 56939 protein has an amino acid sequence shown in SEQ ID NO:49. In other embodiments, the 56939 protein is substantially identical to SEQ ID NO:49. In yet another embodiment, the 56939 protein is substantially identical to SEQ ID NO:49 and retains the functional activity of the protein of SEQ ID NO:49, as described in detail in the subsections above.

[3240] 56939 Chimeric or Fusion Proteins

[3241] In another aspect, the invention provides 56939 chimeric or fusion proteins. As used herein, a 56939 “chimeric protein” or “fusion protein” includes a 56939 polypeptide linked to a non-56939 polypeptide. A “non-56939 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the 56939 protein, e.g., a protein which is different from the 56939 protein and which is derived from the same or a different organism. The 56939 polypeptide of the fusion protein can correspond to all or a portion, e.g., a fragment described herein, of a 56939 amino acid sequence. In a preferred embodiment, a 56939 fusion protein includes at least one (or two) biologically active portions of a 56939 protein. The non-56939 polypeptide can be fused to the N-terminus or C-terminus of the 56939 polypeptide.

[3242] The fusion protein can include a moiety which has a high affinity for a ligand. For example, the fusion protein can be a GST-56939 fusion protein in which the 56939 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 56939. Alternatively, the fusion protein can be a 56939 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 56939 can be increased through use of a heterologous signal sequence.

[3243] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[3244] The 56939 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 56939 fusion proteins can be used to affect the bioavailability of a 56939 substrate. 56939 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example, (i) aberrant modification or mutation of a gene encoding a 56939 protein; (ii) mis-regulation of the 56939 gene; and (iii) aberrant post-translational modification of a 56939 protein.

[3245] Moreover, the 56939-fusion proteins of the invention can be used as immunogens to produce anti-56939 antibodies in a subject, to purify 56939 ligands and in screening assays to identify molecules which inhibit the interaction of 56939 with a 56939 substrate.

[3246] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 56939-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 56939 protein.

[3247] Variants of 56939 Proteins

[3248] In another aspect, the invention also features a variant of a 56939 polypeptide, e.g., which functions as an agonist (mimetics) or as an antagonist. Variants of the 56939 proteins can be generated by mutagenesis, e.g., discrete point mutation, the insertion or deletion of sequences or the truncation of a 56939 protein. An agonist of the 56939 proteins can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of a 56939 protein. An antagonist of a 56939 protein can inhibit one or more of the activities of the naturally occurring form of the 56939 protein by, for example, competitively modulating a 56939-mediated activity of a 56939 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 56939 protein.

[3249] Variants of a 56939 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a 56939 protein for agonist or antagonist activity.

[3250] Libraries of fragments, e.g., N-terminal, C-terminal, or internal fragments, of a 56939 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of a 56939 protein. Variants in which a cysteine residue is added or deleted or in which a residue which is glycosylated is added or deleted are particularly preferred.

[3251] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property are known in the art. Such methods are adaptable for rapid screening of the gene libraries generated by combinatorial mutagenesis of 56939 proteins. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify 56939 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6:327-331).

[3252] Cell-based assays can be exploited to analyze a variegated 56939 library. For example, a library of expression vectors can be transfected into a cell line, e.g., a cell line which ordinarily responds to 56939 in a substrate-dependent manner. The transfected cells are then contacted with 56939 and the effect of the expression of the mutant on signaling by the 56939 substrate can be detected, e.g., by measuring hydrolysis of acyl-CoAs. Plasmid DNA can then be recovered from the cells which score for inhibition, or alternatively, potentiation of signaling by the 56939 substrate, and the individual clones further characterized.

[3253] In another aspect, the invention features a method of making a 56939 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 56939 polypeptide. The method includes: altering the sequence of a 56939 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[3254] In another aspect, the invention features a method of making a fragment or analog of a 56939 polypeptide having a biological activity of a naturally occurring 56939 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of a 56939 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[3255] Anti-56939 Antibodies

[3256] In another aspect, the invention provides an anti-56939 antibody, or a fragment thereof (e.g., an antigen-binding fragment thereof). The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. As used herein, the term “antibody” refers to a protein comprising at least one, and preferably two, heavy (H) chain variable regions (abbreviated herein as VH), and at least one and preferably two light (L) chain variable regions (abbreviated herein as VL). The VH and VL regions can be further subdivided into regions of hypervariability, termed “complementarity determining regions” (“CDR”), interspersed with regions that are more conserved, termed “framework regions” (FR). The extent of the framework region and CDR's has been precisely defined (see, Kabat, E. A., et al. (1991) Sequences of Proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, and Chothia, C. et al. (1987) J. Mol. Biol. 196:901-917, which are incorporated herein by reference). Each VH and VL is composed of three CDR's and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.

[3257] The anti-56939 antibody can further include a heavy and light chain constant region, to thereby form a heavy and light immunoglobulin chain, respectively. In one embodiment, the antibody is a tetramer of two heavy immunoglobulin chains and two light immunoglobulin chains, wherein the heavy and light immunoglobulin chains are inter-connected by, e.g., disulfide bonds. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. The light chain constant region is comprised of one domain, CL. The variable region of the heavy and light chains contains a binding domain that interacts with an antigen. The constant regions of the antibodies typically mediate the binding of the antibody to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system.

[3258] As used herein, the term “immunoglobulin” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes. The recognized human immunoglobulin genes include the kappa, lambda, alpha (IgA1 and IgA2), gamma (IgG1, IgG2, IgG3, IgG4), delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Full-length immunoglobulin “light chains” (about 25 KDa or 214 amino acids) are encoded by a variable region gene at the NH2-terminus (about 110 amino acids) and a kappa or lambda constant region gene at the COOH-terminus. Full-length immunoglobulin “heavy chains” (about 50 KDa or 446 amino acids) are similarly encoded by a variable region gene (about 116 amino acids) and one of the other aforementioned constant region genes, e.g., gamma (encoding about 330 amino acids).

[3259] The term “antigen-binding fragment” of an antibody (or simply “antibody portion,” or “fragment”), as used herein, refers to one or more fragments of a full-length antibody that retain the ability to specifically bind to the antigen, e.g., 56939 polypeptide or fragment thereof. Examples of antigen-binding fragments of the anti-56939 antibody include, but are not limited to: (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab′)₂ fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 341:544-546), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) Science 242:423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879-5883). Such single chain antibodies are also encompassed within the term “antigen-binding fragment” of an antibody. These antibody fragments are obtained using conventional techniques known to those with skill in the art, and the fragments are screened for utility in the same manner as are intact antibodies.

[3260] The anti-56939 antibody can be a polyclonal or a monoclonal antibody. In other embodiments, the antibody can be recombinantly produced, e.g., produced by phage display or by combinatorial methods.

[3261] Phage display and combinatorial methods for generating anti-56939 antibodies are known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. International Publication No. WO 92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. International Publication No. WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and Barbas et al. (1991) PNAS 88:7978-7982, the contents of all of which are incorporated by reference herein).

[3262] In one embodiment, the anti-56939 antibody is a fully human antibody (e.g., an antibody made in a mouse which has been genetically engineered to produce an antibody from a human immunoglobulin sequence), or a non-human antibody, e.g., a rodent (mouse or rat), goat, primate (e.g., monkey), or camel antibody. Preferably, the non-human antibody is a rodent (mouse or rat) antibody. Methods of producing rodent antibodies are known in the art.

[3263] Human monoclonal antibodies can be generated using transgenic mice carrying the human immunoglobulin genes rather than the mouse system. Splenocytes from these transgenic mice immunized with the antigen of interest are used to produce hybridomas that secrete human mAbs with specific affinities for epitopes from a human protein (see, e.g., Wood et al. International Application WO 91/00906, Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International Application WO 92/03918; Kay et al. International Application WO 92/03917; Lonberg, N. et al. 1994 Nature 368:856-859; Green, L. L. et al. 1994 Nature Genet. 7:13-21; Morrison, S. L. et al. 1984 Proc. Natl. Acad. Sci. USA 81:6851-6855; Bruggeman et al. 1993 Year Immunol 7:33-40; Tuaillon et al. 1993 PNAS 90:3720-3724; Bruggeman et al. 1991 Eur J Immunol 21:1323-1326).

[3264] An anti-56939 antibody can be one in which the variable region, or a portion thereof, e.g., the CDR's, are generated in a non-human organism, e.g., a rat or mouse. Chimeric, CDR-grafted, and humanized antibodies are within the invention. Antibodies generated in a non-human organism, e.g., a rat or mouse, and then modified, e.g., in the variable framework or constant region, to decrease antigenicity in a human are within the invention.

[3265] Chimeric antibodies can be produced by recombinant DNA techniques known in the art. For example, a gene encoding the Fc constant region of a murine (or other species) monoclonal antibody molecule is digested with restriction enzymes to remove the region encoding the murine Fc, and the equivalent portion of a gene encoding a human Fc constant region is substituted (see Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., International Application WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al., European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al. (1987) PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al. (1985) Nature 314:446-449; and Shaw et al., 1988, J. Natl Cancer Inst. 80:1553-1559).

[3266] A humanized or CDR-grafted antibody will have at least one or two but generally all three recipient CDR's (of heavy and/or light immunoglobulin chains) replaced with a donor CDR. The antibody may have at least a portion of a CDR replaced with a non-human CDR, or only some of the CDR's may be replaced with non-human CDR's. It is only necessary to replace the number of CDR's required for binding of the humanized antibody to a 56939 or a fragment thereof. Preferably, the donor will be a rodent antibody, e.g., a rat or mouse antibody, and the recipient will be a human framework or a human consensus framework. Typically, the immunoglobulin providing the CDR's is called the “donor” and the immunoglobulin providing the framework is called the “acceptor.” In one embodiment, the donor immunoglobulin is a non-human (e.g., rodent) immunoglobulin. The acceptor framework is a naturally-occurring (e.g., a human) framework or a consensus framework, or a sequence about 85% or higher, preferably 90%, 95%, 99% or higher identical thereto. As used herein, the term “consensus sequence” refers to the sequence formed from the most frequently occurring amino acids (or nucleotides) in a family of related sequences (see, e.g., Winnacker, From Genes to Clones (Verlagsgesellschaft, Weinheim, Germany 1987)). In a family of proteins, each position in the consensus sequence is occupied by the amino acid occurring most frequently at that position in the family. If two amino acids occur equally frequently, either can be included in the consensus sequence. A “consensus framework” refers to the framework region in the consensus immunoglobulin sequence.
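
The definition of a consensus sequence given above can be illustrated with a short Python sketch. It assumes the family members have already been aligned to a common length. The function name and the toy sequences are hypothetical. Where two residues occur equally often, the definition permits either; the sketch picks the alphabetically first one only to make the output reproducible.

```python
from collections import Counter


def consensus_sequence(aligned_sequences):
    """Return the most frequent residue at each position of an aligned family."""
    if not aligned_sequences:
        raise ValueError("at least one sequence is required")
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("all sequences must be aligned to the same length")
    consensus = []
    for column in zip(*aligned_sequences):
        counts = Counter(column)
        highest = max(counts.values())
        # Tie-break (an assumption): take the alphabetically first residue
        # among those that occur equally frequently.
        consensus.append(min(res for res, n in counts.items() if n == highest))
    return "".join(consensus)


# Toy aligned fragments (hypothetical, not taken from any real framework)
family = ["EVQLV", "QVQLV", "EVKLQ"]
print(consensus_sequence(family))
```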

[3267] An antibody can be humanized by methods known in the art. Humanized antibodies can be generated by replacing sequences of the Fv variable region which are not directly involved in antigen binding with equivalent sequences from human Fv variable regions. General methods for generating humanized antibodies are provided by Morrison, S. L., 1985, Science 229:1202-1207, by Oi et al., 1986, BioTechniques 4:214, and by Queen et al. U.S. Pat. No. 5,585,089, U.S. Pat. No. 5,693,761 and U.S. Pat. No. 5,693,762, the contents of all of which are hereby incorporated by reference. Those methods include isolating, manipulating, and expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable regions from at least one of a heavy or light chain. Sources of such nucleic acid are well known to those skilled in the art and, for example, may be obtained from a hybridoma producing an antibody against a 56939 polypeptide or fragment thereof. The recombinant DNA encoding the humanized antibody, or fragment thereof, can then be cloned into an appropriate expression vector.

[3268] Humanized or CDR-grafted antibodies can be produced by CDR-grafting or CDR substitution, wherein one, two, or all CDR's of an immunoglobulin chain can be replaced. See e.g., U.S. Pat. No. 5,225,539; Jones et al. 1986 Nature 321:522-525; Verhoeyen et al. 1988 Science 239:1534; Beidler et al. 1988 J. Immunol. 141:4053-4060; Winter U.S. Pat. No. 5,225,539, the contents of all of which are hereby expressly incorporated by reference. Winter describes a CDR-grafting method which may be used to prepare the humanized antibodies of the present invention (UK Patent Application GB 2188638A, filed on Mar. 26, 1987; Winter U.S. Pat. No. 5,225,539), the contents of which are expressly incorporated by reference.

[3269] Also within the scope of the invention are humanized antibodies in which specific amino acids have been substituted, deleted or added. Preferred humanized antibodies have amino acid substitutions in the framework region, such as to improve binding to the antigen. For example, a humanized antibody will have framework residues identical to the donor framework residue or to another amino acid other than the recipient framework residue. To generate such antibodies, a selected, small number of acceptor framework residues of the humanized immunoglobulin chain can be replaced by the corresponding donor amino acids. Preferred locations of the substitutions include amino acid residues adjacent to the CDR, or which are capable of interacting with a CDR (see e.g., U.S. Pat. No. 5,585,089). Criteria for selecting amino acids from the donor are described in U.S. Pat. No. 5,585,089, e.g., columns 12-16 of U.S. Pat. No. 5,585,089, the contents of which are hereby incorporated by reference. Other techniques for humanizing antibodies are described in Padlan et al. EP 519596 A1, published on Dec. 23, 1992.

[3270] In preferred embodiments an antibody can be made by immunizing with purified 56939 antigen, or a fragment thereof, e.g., a fragment described herein.

[3271] A full-length 56939 protein or an antigenic peptide fragment of 56939 can be used as an immunogen or can be used to identify anti-56939 antibodies made with other immunogens, e.g., cells, membrane preparations, and the like. The antigenic peptide of 56939 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO:49 and encompass an epitope of 56939. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.

[3272] Fragments of 56939 which include residues 110 to 115, about 323 to 330, or about 339 to 349 can be used to make antibodies against hydrophilic regions of the 56939 protein, e.g., used as immunogens or used to characterize the specificity of an antibody. Similarly, fragments of 56939 which include residues 78 to 83, about 100 to 104, or about 223 to 231 can be used to make an antibody against a hydrophobic region of the 56939 protein. A fragment of 56939 which includes residues about 1 to 415 can be used to make an antibody against the acyl-CoA thioesterase region of the 56939 protein.

[3273] Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[3274] Antibodies which bind only native 56939 protein, only denatured or otherwise non-native 56939 protein, or which bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be identified by identifying antibodies which bind to native but not denatured 56939 protein.

[3275] Preferred epitopes encompassed by the antigenic peptide are regions of 56939 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 56939 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 56939 protein and are thus likely to constitute surface residues useful for targeting antibody production.
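
As a rough illustration of how hydrophilic candidate regions might be located computationally, the following Python sketch scans a sequence with a sliding window. It does not implement the Emini surface probability analysis named above; it substitutes the simpler Kyte-Doolittle hydropathy scale, on which lower averages indicate more hydrophilic stretches. The window length of 7, the function names, and the toy sequence are assumptions made for this example.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic)
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(sequence, window=7):
    """Return (1-based start, mean hydropathy) for every window in the sequence."""
    scores = [KYTE_DOOLITTLE[res] for res in sequence]  # KeyError on unknown residues
    return [
        (i + 1, sum(scores[i:i + window]) / window)
        for i in range(len(scores) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=7):
    """Return the window with the lowest mean hydropathy."""
    return min(window_hydropathy(sequence, window), key=lambda item: item[1])


# Toy sequence (hypothetical): a lysine stretch flanked by isoleucine stretches
start, score = most_hydrophilic_window("IIIIIIIKKKKKKKIIIIIII")
print(start, round(score, 2))
```

Windows flagged this way are only candidates; whether a stretch is actually surface-exposed or antigenic would still need to be established experimentally.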

[3276] The anti-56939 antibody can be a single chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher, D. et al. (1999) Ann N Y Acad Sci 880:263-80; and Reiter, Y. (1996) Clin Cancer Res 2:245-52). The single chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 56939 protein.

[3277] In a preferred embodiment, the antibody has effector function and can fix complement. In other embodiments, the antibody does not recruit effector cells or fix complement.

[3278] In a preferred embodiment, the antibody has reduced or no ability to bind an Fc receptor. For example, it is an isotype or subtype, fragment or other mutant, which does not support binding to an Fc receptor, e.g., it has a mutagenized or deleted Fc receptor binding region.

[3279] In a preferred embodiment, an anti-56939 antibody alters (e.g., increases or decreases) the activity of a 56939 polypeptide.

[3280] The antibody can be coupled to a toxin, e.g., a polypeptide toxin, e.g., ricin or diphtheria toxin or an active fragment thereof, or to a radioactive nucleus or an imaging agent, e.g., a radioactive, enzymatic, or other imaging agent, e.g., an NMR contrast agent. Labels which produce detectable radioactive emissions or fluorescence are preferred.

[3281] An anti-56939 antibody (e.g., monoclonal antibody) can be used to isolate 56939 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-56939 antibody can be used to detect 56939 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-56939 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labeling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

[3282] The invention also includes a nucleic acid which encodes an anti-56939 antibody, e.g., an anti-56939 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g., CHO or lymphatic cells.

[3283] The invention also includes cell lines, e.g., hybridomas, which make an anti-56939 antibody, e.g., an antibody described herein, and methods of using said cells to make an anti-56939 antibody.

[3284] 56939 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[3285] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[3286] A vector can include a 56939 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 56939 proteins, mutant forms of 56939 proteins, fusion proteins, and the like).

[3287] The recombinant expression vectors of the invention can be designed for expression of 56939 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[3288] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[3289] Purified fusion proteins can be used in 56939 activity assays, (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 56939 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells which are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six weeks).

[3290] One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
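
The codon-alteration strategy can be sketched in Python as a back-translation of a protein sequence using one chosen codon per amino acid. This is a simplification made for illustration: it starts from the protein sequence rather than editing an existing coding sequence. The function name and the toy peptide are hypothetical, and the small codon table below is an illustrative placeholder that covers only six amino acids. In practice it would be replaced with a complete table derived from measured codon usage for the intended host.

```python
# Illustrative placeholder table (an assumption): one assumed preferred codon
# per amino acid, covering only the residues used in the toy example below.
ASSUMED_PREFERRED_CODONS = {
    "M": "ATG",
    "W": "TGG",
    "L": "CTG",
    "K": "AAA",
    "A": "GCG",
    "G": "GGC",
}


def back_translate(protein, preferred_codons):
    """Build a coding sequence that uses the supplied codon for each residue."""
    missing = sorted(set(protein) - set(preferred_codons))
    if missing:
        raise KeyError("no preferred codon supplied for: " + ", ".join(missing))
    return "".join(preferred_codons[residue] for residue in protein)


# Toy peptide (hypothetical, not taken from SEQ ID NO:49)
print(back_translate("MKLAGW", ASSUMED_PREFERRED_CODONS))
```

The resulting sequence encodes the same amino acids, so only the nucleic acid, and not the protein, is altered.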

[3291] The 56939 expression vector can be a yeast expression vector, a vector for expression in insect cells, e.g., a baculovirus expression vector or a vector suitable for expression in mammalian cells.

[3292] When used in mammalian cells, the expression vector's control functions can be provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[3293] In another embodiment, the promoter is an inducible promoter, e.g., a promoter regulated by a steroid hormone, by a polypeptide hormone (e.g., by means of a signal transduction pathway), or by a heterologous polypeptide (e.g., the tetracycline-inducible systems, “Tet-On” and “Tet-Off”; see, e.g., Clontech Inc., CA, Gossen and Bujard (1992) Proc. Natl. Acad. Sci. USA 89:5547, and Paillard (1998) Human Gene Therapy 9:983).

[3294] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3:537-546).

[3295] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus.

[3296] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., a 56939 nucleic acid molecule within a recombinant expression vector or a 56939 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[3297] A host cell can be any prokaryotic or eukaryotic cell. For example, a 56939 protein can be expressed in bacterial cells (such as E. coli), insect cells, yeast or mammalian cells (such as Chinese hamster ovary (CHO) cells or COS cells (African green monkey kidney CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182)). Other suitable host cells are known to those skilled in the art.

[3298] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[3299] A host cell of the invention can be used to produce (i.e., express) a 56939 protein. Accordingly, the invention further provides methods for producing a 56939 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a 56939 protein has been introduced) in a suitable medium such that a 56939 protein is produced. In another embodiment, the method further includes isolating a 56939 protein from the medium or the host cell.

[3300] In another aspect, the invention features a cell or purified preparation of cells which include a 56939 transgene, or which otherwise misexpress 56939. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include a 56939 transgene, e.g., a heterologous form of a 56939, e.g., a gene derived from humans (in the case of a non-human cell). The 56939 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene that misexpresses an endogenous 56939, e.g., a gene the expression of which is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders that are related to mutated or misexpressed 56939 alleles or for use in drug screening.

[3301] In another aspect, the invention features a human cell, e.g., a hematopoietic stem cell, transformed with nucleic acid which encodes a subject 56939 polypeptide.

[3302] Also provided are cells, preferably human cells, e.g., human hematopoietic or fibroblast cells, in which an endogenous 56939 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 56939 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 56939 gene. For example, an endogenous 56939 gene which is “transcriptionally silent,” e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques such as targeted homologous recombination can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published May 16, 1991.

[3303] In a preferred embodiment, recombinant cells described herein can be used for replacement therapy in a subject. For example, a nucleic acid encoding a 56939 polypeptide operably linked to an inducible promoter (e.g., a steroid hormone receptor-regulated promoter) is introduced into a human or nonhuman, e.g., mammalian, e.g., porcine recombinant cell. The cell is cultivated and encapsulated in a biocompatible material, such as poly-lysine alginate, and subsequently implanted into the subject. See, e.g., Lanza (1996) Nat. Biotechnol. 14:1107; Joki et al. (2001) Nat. Biotechnol. 19:35; and U.S. Pat. No. 5,876,742. Production of 56939 polypeptide can be regulated in the subject by administering an agent (e.g., a steroid hormone) to the subject. In another preferred embodiment, the implanted recombinant cells express and secrete an antibody specific for a 56939 polypeptide. The antibody can be any antibody or any antibody derivative described herein.

[3304] 56939 Transgenic Animals

[3305] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of a 56939 protein and for identifying and/or evaluating modulators of 56939 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 56939 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[3306] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of a 56939 protein to particular cells. A transgenic founder animal can be identified based upon the presence of a 56939 transgene in its genome and/or expression of 56939 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a 56939 protein can further be bred to other transgenic animals carrying other transgenes.

[3307] 56939 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments the nucleic acid is placed under the control of a tissue specific promoter, e.g., a milk or egg specific promoter, and recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[3308] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[3309] Uses of 56939

[3310] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: a) screening assays; b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); and c) methods of treatment (e.g., therapeutic and prophylactic).

[3311] The isolated nucleic acid molecules of the invention can be used, for example, to express a 56939 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect a 56939 mRNA (e.g., in a biological sample) or a genetic alteration in a 56939 gene, and to modulate 56939 activity, as described further below. The 56939 proteins can be used to treat disorders characterized by insufficient or excessive production of a 56939 substrate or production of 56939 inhibitors. In addition, the 56939 proteins can be used to screen for naturally occurring 56939 substrates, to screen for drugs or compounds which modulate 56939 activity, as well as to treat disorders characterized by insufficient or excessive production of 56939 protein or production of 56939 protein forms which have decreased, aberrant or unwanted activity compared to 56939 wild type protein (e.g., metabolic disorders). Moreover, the anti-56939 antibodies of the invention can be used to detect and isolate 56939 proteins, regulate the bioavailability of 56939 proteins, and modulate 56939 activity.

[3312] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 56939 polypeptide is provided. The method includes: contacting the compound with the subject 56939 polypeptide; and evaluating ability of the compound to interact with, e.g., to bind or form a complex with the subject 56939 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with subject 56939 polypeptide. It can also be used to find natural or synthetic inhibitors of subject 56939 polypeptide. Screening methods are discussed in more detail below.

[3313] 56939 Screening Assays

[3314] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to 56939 proteins, have a stimulatory or inhibitory effect on, for example, 56939 expression or 56939 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 56939 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 56939 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[3315] In one embodiment, the invention provides assays for screening candidate or test compounds which are substrates of a 56939 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate an activity of a 56939 protein or polypeptide or a biologically active portion thereof.

[3316] In one embodiment, an activity of a 56939 protein can be assayed using an assay system acceptable for detecting thioesterase activity. For example, thioesterase activity can be assessed radiochemically by extracting and assaying the [14C]palmitic acid formed from [1-14C]palmitoyl-CoA during an incubation with a 56939 protein, fragment, or variant. For example, the assay system can contain 25 mM potassium phosphate buffer (pH 8), 20 µg/ml bovine serum albumin, 10 µM [1-14C]palmitoyl-CoA (20 nCi), and a 56939 protein in a final volume of about 0.1 ml. The duration of the incubation can be approximately three minutes. See, e.g., Smith, S. (1981) Methods Enzymol. 71C:181-188 and Joshi et al. (1993) J. Biol. Chem. 268:22508-22513 for additional details regarding methods of assaying for thioesterase activity.
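
As a rough illustration of how counts from such a radiochemical assay might be converted into an amount of product, the following sketch uses the example conditions given above (10 µM substrate, 20 nCi, 0.1 ml, three minutes). The function names, the counting-efficiency parameter, and the sample and blank counts are hypothetical and are not taken from the cited methods; the sketch also assumes that all extracted radioactivity is [14C]palmitic acid.

```python
# Illustrative sketch only: converts extracted [14C]palmitic acid counts to pmol of product.
# All names and the example counts below are hypothetical.

DPM_PER_NCI = 2220.0  # 1 nCi = 37 disintegrations/s = 2220 disintegrations/min


def substrate_pmol(conc_uM, volume_ml):
    """Substrate in the assay, in pmol (1 uM x 1 ml = 1 nmol = 1000 pmol)."""
    return conc_uM * volume_ml * 1000.0


def specific_activity(activity_nCi, conc_uM, volume_ml):
    """Specific activity of the substrate, in dpm per pmol."""
    return activity_nCi * DPM_PER_NCI / substrate_pmol(conc_uM, volume_ml)


def product_pmol(sample_cpm, blank_cpm, counting_efficiency, dpm_per_pmol):
    """pmol of labeled product, after blank subtraction and efficiency correction."""
    dpm = (sample_cpm - blank_cpm) / counting_efficiency
    return dpm / dpm_per_pmol


# Example conditions from the text: 10 uM substrate, 20 nCi, 0.1 ml, 3 min.
sa = specific_activity(20.0, 10.0, 0.1)
formed = product_pmol(sample_cpm=4196.0, blank_cpm=200.0,
                      counting_efficiency=0.9, dpm_per_pmol=sa)
rate_pmol_per_min = formed / 3.0
```

Under these assumed counts the substrate pool is 1000 pmol, so the amount of product formed can also be read as a fraction of substrate hydrolyzed, which is a quick check that the assay remains in a range where substrate is not depleted.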

[3317] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann, R. N. et al. (1994) J. Med. Chem. 37:2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).

[3318] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. USA 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and Gallop et al. (1994) J. Med. Chem. 37:1233.

[3319] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner, U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. 5,223,409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382; Felici (1991) J. Mol. Biol. 222:301-310; Ladner supra.).

[3320] In one embodiment, an assay is a cell-based assay in which a cell which expresses a 56939 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 56939 activity is determined. Determining the ability of the test compound to modulate 56939 activity can be accomplished by monitoring, for example, hydrolytic activity. The cell, for example, can be of mammalian origin, e.g., human.

[3321] The ability of the test compound to modulate 56939 binding to a compound, e.g., a 56939 substrate, or to bind to 56939 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 56939 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 56939 could be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 56939 binding to a 56939 substrate in a complex. For example, compounds (e.g., 56939 substrates) can be labeled with ¹²⁵I, ³⁵S, ¹⁴C, or ³H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[3322] The ability of a compound (e.g., a 56939 substrate) to interact with 56939 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 56939 without the labeling of either the compound or the 56939. McConnell, H. M. et al. (1992) Science 257:1906-1912. As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 56939.

[3323] In yet another embodiment, a cell-free assay is provided in which a 56939 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 56939 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 56939 proteins to be used in assays of the present invention include fragments which participate in interactions with non-56939 molecules, e.g., fragments with high surface probability scores.

[3324] Soluble and/or membrane-bound forms of isolated proteins (e.g., 56939 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, isotridecylpoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[3325] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[3326] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternately, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
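
The distance dependence mentioned above is commonly described by the Förster relation, E = 1/(1 + (r/R0)^6), where r is the donor-acceptor separation and R0 is the separation at which transfer is 50% efficient. The text does not state this formula or any R0 value; the sketch below assumes the standard relation and a hypothetical R0 of 5 nm purely to illustrate how steeply the signal falls off with distance.

```python
# Illustrative sketch only: standard Forster relation, with a hypothetical R0.

def transfer_efficiency(distance_nm, forster_radius_nm):
    """Energy transfer efficiency E = 1 / (1 + (r / R0)**6)."""
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Forster radius must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


R0_NM = 5.0  # hypothetical Forster radius for an assumed donor/acceptor pair

# Bound (close) versus unbound (distant) labels give very different efficiencies.
bound = transfer_efficiency(2.5, R0_NM)     # labels brought together by binding
at_r0 = transfer_efficiency(5.0, R0_NM)     # exactly at the Forster radius
unbound = transfer_efficiency(10.0, R0_NM)  # labels far apart in solution
```

Because the efficiency depends on the sixth power of the distance ratio, halving or doubling the separation around R0 moves the signal between nearly complete and nearly absent transfer, which is why acceptor emission can serve as a readout of binding.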

[3327] In another embodiment, determining the ability of the 56939 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal which can be used as an indication of real-time reactions between biological molecules.

[3328] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound, (which is not anchored), can be labeled, either directly or indirectly, with detectable labels discussed herein.

[3329] It may be desirable to immobilize either 56939, an anti-56939 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a 56939 protein, or interaction of a 56939 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/56939 fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 56939 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and the complex is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 56939 binding or activity determined using standard techniques.

[3330] Other techniques for immobilizing either a 56939 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 56939 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical).

[3331] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[3332] In one embodiment, this assay is performed utilizing antibodies reactive with 56939 protein or target molecules but which do not interfere with binding of the 56939 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 56939 protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 56939 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 56939 protein or target molecule.

[3333] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components, by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., (1993) Trends Biochem Sci 18:284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York.); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. (1999) Current Protocols in Molecular Biology, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., (1998) J Mol Recognit 11:141-8; Hage, D. S., and Tweed, S. A. (1997) J Chromatogr B Biomed Sci Appl. 699:499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[3334] In a preferred embodiment, the assay includes contacting the 56939 protein or biologically active portion thereof with a known compound which binds 56939 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a 56939 protein, wherein determining the ability of the test compound to interact with a 56939 protein includes determining the ability of the test compound to preferentially bind to 56939 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[3335] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 56939 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of a 56939 protein through modulation of the activity of a downstream effector of a 56939 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[3336] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), a reaction mixture containing the target gene product and the binding partner is prepared under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.
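
The comparison of control and test reaction mixtures described above can be expressed as a percent inhibition of complex formation. The following is a minimal sketch of one way such a comparison might be computed; the function name, the optional background subtraction, and the replicate signal values are hypothetical and are not prescribed by the text.

```python
# Illustrative sketch only: percent inhibition of complex formation from replicate signals.
from statistics import mean


def percent_inhibition(control_signals, test_signals, background=0.0):
    """Percent reduction in complex signal in the presence of a test compound.

    control_signals: replicate signals without test compound (or with placebo)
    test_signals:    replicate signals with test compound
    background:      hypothetical signal from a no-complex blank
    """
    control = mean(control_signals) - background
    test = mean(test_signals) - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (1.0 - test / control)


# Hypothetical readings for normal and mutant target gene product.
normal = percent_inhibition([1000.0, 1100.0, 900.0], [950.0, 1000.0, 1050.0])
mutant = percent_inhibition([1000.0, 1100.0, 900.0], [250.0, 300.0, 200.0])
```

With these assumed readings the compound has little effect on complexes formed by the normal product and a large effect on those formed by the mutant product, which corresponds to the case noted above of compounds that disrupt interactions of mutant but not normal target gene products.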

[3337] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[3338] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner, is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled, either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[3339] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[3340] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[3341] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in which either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[3342] In yet another aspect, the 56939 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 56939 (“56939-binding proteins” or “56939-bp”) and are involved in 56939 activity. Such 56939-bps can be activators or inhibitors of signals by the 56939 proteins or 56939 targets as, for example, downstream elements of a 56939-mediated signaling pathway.

[3343] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a 56939 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively, the 56939 protein can be fused to the activation domain.) If the “bait” and the “prey” proteins are able to interact, in vivo, forming a 56939-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., lacZ) which is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene which encodes the protein which interacts with the 56939 protein.

[3344] In another embodiment, modulators of 56939 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 56939 mRNA or protein evaluated relative to the level of expression of 56939 mRNA or protein in the absence of the candidate compound. When expression of 56939 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 56939 mRNA or protein expression. Alternatively, when expression of 56939 mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 56939 mRNA or protein expression. The level of 56939 mRNA or protein expression can be determined by methods described herein for detecting 56939 mRNA or protein.
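As an illustration only, the comparison described above can be sketched in Python. The sketch assumes replicate expression measurements for cells contacted and not contacted with the candidate compound, and it uses an exact permutation test on the difference in means as the significance criterion. The specification does not prescribe any particular statistical test; the function names and the 0.05 threshold are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Exact two-sided permutation p-value for a difference in group means."""
    pooled = list(treated) + list(untreated)
    n_treated = len(treated)
    observed = abs(mean(treated) - mean(untreated))
    total = 0
    extreme = 0
    # Enumerate every way of relabeling n_treated of the pooled values as "treated".
    for idx in combinations(range(len(pooled)), n_treated):
        chosen = set(idx)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance so that ties with the observed difference are counted.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def classify_candidate(treated, untreated, alpha=0.05):
    """Label a compound 'stimulator', 'inhibitor', or 'no significant change'."""
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant change"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"
```

With four replicates per group the smallest attainable two-sided p-value in this sketch is 2/70, so at least four replicates per group are needed before any change can be called significant at the 0.05 level.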

[3345] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a 56939 protein can be confirmed in vivo, e.g., in an animal such as an animal model for metabolic disorders.

[3346] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a 56939 modulating agent, an antisense 56939 nucleic acid molecule, a 56939-specific antibody, or a 56939-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[3347] 56939 Detection Assays

[3348] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome e.g., to locate gene regions associated with genetic disease or to associate 56939 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[3349] 56939 Chromosome Mapping

[3350] The 56939 nucleotide sequences or portions thereof can be used to map the location of the 56939 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 56939 sequences with genes associated with disease.

[3351] Briefly, 56939 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 56939 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 56939 sequences will yield an amplified fragment.

[3352] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes, and a full set of mouse chromosomes, can allow easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220:919-924).

[3353] Other mapping strategies e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries can be used to map 56939 to a chromosomal location.

[3354] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases, will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques ((1988) Pergamon Press, New York).

[3355] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[3356] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325:783-787.

[3357] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 56939 gene, can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

[3358] 56939 Tissue Typing

[3359] 56939 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).

[3360] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 56939 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.

[3361] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences of SEQ ID NO:48 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO:50, are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[3362] If a panel of reagents from 56939 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[3363] Use of Partial 56939 Sequences in Forensic Biology

[3364] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[3365] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e. another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO:48 (e.g., fragments derived from the noncoding regions of SEQ ID NO:48 having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[3366] The 56939 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 56939 probes can be used to identify tissue by species and/or by organ type.

[3367] In a similar fashion, these reagents, e.g., 56939 primers or probes can be used to screen tissue culture for contamination (i.e. screen for the presence of a mixture of different types of cells in a culture).

[3368] Predictive Medicine of 56939

[3369] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[3370] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene which encodes 56939.

[3371] Such disorders include, e.g., a disorder associated with the misexpression of the 56939 gene; a disorder of metabolism; a disorder of the cardiovascular or hepatic system; or any other 56939-related disorder described herein.

[3372] The method includes one or more of the following:

[3373] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 56939 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[3374] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 56939 gene;

[3375] detecting, in a tissue of the subject, the misexpression of the 56939 gene, at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[3376] detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of a 56939 polypeptide.

[3377] In preferred embodiments the method includes: ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 56939 gene; an insertion of one or more nucleotides into the gene, a point mutation, e.g., a substitution of one or more nucleotides of the gene, a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[3378] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from SEQ ID NO:48, or naturally occurring mutants thereof or 5′ or 3′ flanking sequences naturally associated with the 56939 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[3379] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 56939 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 56939.

[3380] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[3381] In preferred embodiments the method includes determining the structure of a 56939 gene, an abnormal structure being indicative of risk for the disorder.

[3382] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 56939 protein or a nucleic acid, which hybridizes specifically with the gene. These and other embodiments are discussed below.

[3383] Diagnostic and Prognostic Assays of 56939

[3384] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 56939 molecules and for identifying variations and mutations in the sequence of 56939 molecules.

[3385] Expression Monitoring and Profiling:

[3386] The presence, level, or absence of 56939 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 56939 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 56939 protein such that the presence of 56939 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the 56939 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 56939 genes; measuring the amount of protein encoded by the 56939 genes; or measuring the activity of the protein encoded by the 56939 genes.

[3387] The level of mRNA corresponding to the 56939 gene in a cell can be determined both by in situ and by in vitro formats.

[3388] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 56939 nucleic acid, such as the nucleic acid of SEQ ID NO:48, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 56939 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[3389] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 56939 genes.

[3390] The level of mRNA in a sample that is encoded by 56939 can be evaluated with nucleic acid amplification, e.g., by rtPCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
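By way of a non-limiting sketch, the definition of amplification primers given above can be checked computationally. The Python below locates a hypothetical primer pair on a plus-strand template and reports whether the primer lengths and the length of the flanked region fall within the approximate ranges stated above. The function names, the exact-match annealing model, and any sequences supplied to it are illustrative assumptions and are not taken from SEQ ID NO:48.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def check_primer_pair(template, forward, reverse,
                      primer_range=(10, 30), flanked_range=(50, 200)):
    """Locate a primer pair on a plus-strand template.

    Returns None if either primer has no exact match in the expected
    orientation; otherwise returns a dict describing the flanked region.
    """
    template = template.upper()
    forward = forward.upper()
    fwd_start = template.find(forward)
    if fwd_start < 0:
        return None
    fwd_end = fwd_start + len(forward)
    # The reverse primer anneals to the plus strand, so its binding site
    # appears in the template as the primer's reverse complement.
    rev_start = template.find(reverse_complement(reverse), fwd_end)
    if rev_start < 0:
        return None
    flanked = rev_start - fwd_end
    return {
        "flanked_length": flanked,
        "primer_lengths_ok": all(
            primer_range[0] <= len(p) <= primer_range[1]
            for p in (forward, reverse)
        ),
        "flanked_length_ok": flanked_range[0] <= flanked <= flanked_range[1],
    }
```

Exact string matching is used here only for simplicity; in practice annealing tolerates mismatches and depends on reaction conditions, which this sketch does not model.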

[3391] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 56939 gene being analyzed.

[3392] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 56939 mRNA, or genomic DNA, and comparing the presence of 56939 mRNA or genomic DNA in the control sample with the presence of 56939 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 56939 transcript levels.

[3393] A variety of methods can be used to determine the level of protein encoded by 56939. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody with a sample, to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)₂) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[3394] The detection methods can be used to detect 56939 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 56939 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 56939 protein include introducing into a subject a labeled anti-56939 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-56939 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[3395] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 56939 protein, and comparing the presence of 56939 protein in the control sample with the presence of 56939 protein in the test sample.

[3396] The invention also includes kits for detecting the presence of 56939 in a biological sample. For example, the kit can include a compound or agent capable of detecting 56939 protein or mRNA in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 56939 protein or nucleic acid.

[3397] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[3398] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples which can be assayed and compared to the test sample contained. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[3399] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with misexpressed or aberrant or unwanted 56939 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as a metabolism disorder or deregulated cell proliferation.

[3400] In one embodiment, a disease or disorder associated with aberrant or unwanted 56939 expression or activity is identified. A test sample is obtained from a subject and 56939 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 56939 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 56939 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[3401] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 56939 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for a cell metabolism disorder.

[3402] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 56939 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 56939 (e.g., other genes associated with a 56939-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
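Purely as an illustrative sketch, data records of the kind described above could be held in a relational table as follows. The example uses Python's built-in sqlite3 module rather than the Oracle or Sybase environments mentioned above; the table name, column names, and function names are hypothetical, and any rows loaded into it would be placeholder values rather than measured data.

```python
import sqlite3


def build_store(records):
    """Create an in-memory table and load rows of
    (sample_id, subject_id, diagnosis, gene, level)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE expression_record (
            sample_id  TEXT NOT NULL,   -- identifier of the sample
            subject_id TEXT,            -- subject from which the sample was derived
            diagnosis  TEXT,            -- descriptor of the sample
            gene       TEXT NOT NULL,   -- e.g. '56939' or another gene on an array
            level      REAL NOT NULL,   -- value representing the expression level
            PRIMARY KEY (sample_id, gene)
        )
        """
    )
    conn.executemany(
        "INSERT INTO expression_record VALUES (?, ?, ?, ?, ?)", records
    )
    conn.commit()
    return conn


def levels_for_gene(conn, gene):
    """Return (sample_id, diagnosis, level) rows for one gene, ordered by sample."""
    cur = conn.execute(
        "SELECT sample_id, diagnosis, level FROM expression_record "
        "WHERE gene = ? ORDER BY sample_id",
        (gene,),
    )
    return cur.fetchall()
```

Storing one row per sample and gene, rather than one column per gene, lets the same table hold expression values for genes other than 56939 without a schema change.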

[3403] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 56939 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used to diagnose a disorder in a subject wherein an increase or decrease in 56939 expression is an indication that the subject has or is disposed to having a disorder. The method can be used to monitor a treatment for disorders, e.g. metabolism disorders in a subject. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[3404] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 56939 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for a desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from an uncontacted cell.

[3405] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 56939 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profiles is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
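The distance metric described above can be sketched, for illustration only, as the Euclidean length of the difference between two profile vectors. In the Python below each profile is assumed to be a mapping from a gene identifier to an expression value, and every reference profile is assumed to contain all genes present in the subject profile; the function names and gene identifiers other than 56939 are hypothetical.

```python
import math


def profile_distance(profile_a, profile_b, genes):
    """Length of the difference vector between two profiles over the given genes."""
    return math.sqrt(sum((profile_a[g] - profile_b[g]) ** 2 for g in genes))


def most_similar_reference(subject, references):
    """Return the name of the closest reference profile and all distances.

    `subject` maps gene -> expression value; `references` maps a reference
    name -> profile with at least the same genes as the subject profile.
    """
    genes = sorted(subject)
    scores = {
        name: profile_distance(subject, ref, genes)
        for name, ref in references.items()
    }
    best = min(scores, key=scores.get)
    return best, scores
```

Other routine measures, such as a correlation coefficient, could be substituted for `profile_distance` without changing the selection step.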

[3406] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[3407] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile, and the reference expression profiles each include a value representing the level of 56939 expression.

[3408] 56939 Arrays and Uses Thereof

[3409] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to a 56939 molecule (e.g., a 56939 nucleic acid or a 56939 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm², and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to the addresses of the plurality can be disposed on the array.

[3410] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to a 56939 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 56939. Each address of the subset can include a capture probe that hybridizes to a different region of a 56939 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for a 56939 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 56939 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 56939 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).

[3411] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[3412] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to a 56939 polypeptide or fragment thereof. The polypeptide can be a naturally-occurring interaction partner of 56939 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-56939 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[3413] In another aspect, the invention features a method of analyzing the expression of 56939. The method includes providing an array as described above; contacting the array with a sample; and detecting binding of a 56939 molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally, the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[3414] In another embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array, particularly the expression of 56939. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 56939. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
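
The grouping of genes by expression pattern described above can be illustrated with a minimal sketch. It uses a plain Pearson correlation against the 56939 profile as a simple stand-in for the clustering methods named in the text (it is not hierarchical, k-means, or Bayesian clustering). The gene names other than 56939, the expression values, and the 0.9 threshold are hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no co-regulation signal
    return cov / (sd_x * sd_y)


def coregulated(profiles, target, threshold=0.9):
    """Return (gene, r) pairs whose profile correlates with the target's."""
    reference = profiles[target]
    hits = []
    for gene, values in profiles.items():
        if gene == target:
            continue
        r = pearson(reference, values)
        if r >= threshold:
            hits.append((gene, r))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical expression levels across four tissue samples.
profiles = {
    "56939": [1.0, 2.0, 3.0, 4.0],
    "geneA": [2.0, 4.0, 6.0, 8.0],   # rises with 56939
    "geneB": [4.0, 3.0, 2.0, 1.0],   # falls as 56939 rises
    "geneC": [1.0, 1.0, 1.0, 1.0],   # constant
}
hits = coregulated(profiles, "56939")
```

In this toy data set only the hypothetical geneA tracks 56939 across the samples; with real array data a larger and more diverse sample set would be needed, as the paragraph notes.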

[3415] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 56939 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[3416] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of other than the target gene can be ascertained and counteracted.
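
As a rough illustration of the profile comparison described above, the following sketch computes log2 fold changes between cells contacted with an agent and like cells not contacted with the agent, and reports genes other than the target gene whose expression shifts. The gene names, signal values, pseudocount, and two-fold cutoff are assumptions made for the example, not values taken from this disclosure.

```python
from math import log2


def off_target_changes(treated, untreated, target,
                       min_abs_log2=1.0, pseudocount=1.0):
    """Return {gene: log2 fold change} for non-target genes that change.

    The pseudocount avoids division by zero for unexpressed genes; the
    default cutoff of 1.0 corresponds to a two-fold change.
    """
    changes = {}
    for gene in treated.keys() & untreated.keys():
        if gene == target:
            continue
        lfc = log2((treated[gene] + pseudocount) /
                   (untreated[gene] + pseudocount))
        if abs(lfc) >= min_abs_log2:
            changes[gene] = lfc
    return changes


# Hypothetical array signals for treated and untreated cells.
treated = {"target": 10.0, "geneX": 31.0, "geneY": 9.0}
untreated = {"target": 100.0, "geneX": 7.0, "geneY": 9.0}
changes = off_target_changes(treated, untreated, target="target")
```

Here the hypothetical geneX is flagged as a candidate contributor to an undesired effect, while geneY is unchanged and the intended target is excluded from the report.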

[3417] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of a 56939-associated disease or disorder; and processes, such as a cellular transformation associated with a 56939-associated disease or disorder. The method can also evaluate the treatment and/or progression of a 56939-associated disease or disorder.

[3418] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 56939) that could serve as a molecular target for diagnosis or therapeutic intervention.

[3419] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon a 56939 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to a 56939 polypeptide or fragment thereof. For example, multiple variants of a 56939 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the addresses of the plurality can be disposed on the array.

[3420] The polypeptide array can be used to detect a 56939 binding compound, e.g., an antibody in a sample from a subject with specificity for a 56939 polypeptide or the presence of a 56939-binding protein or ligand.

[3421] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 56939 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[3422] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 56939 or from a cell or subject in which a 56939 mediated response has been elicited, e.g., by contact of the cell with 56939 nucleic acid or protein, or administration to the cell or subject of 56939 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 56939 (or does not express as highly as in the case of the 56939 positive plurality of capture probes) or from a cell or subject in which a 56939 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which are preferably other than a 56939 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[3423] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, contacting the array with a first sample from a cell or subject which expresses or mis-expresses 56939 or from a cell or subject in which a 56939-mediated response has been elicited, e.g., by contact of the cell with 56939 nucleic acid or protein, or administration to the cell or subject of 56939 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a second sample from a cell or subject which does not express 56939 (or does not express as highly as in the case of the 56939 positive plurality of capture probes) or from a cell or subject in which a 56939 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used, the plurality of addresses with capture probes should be present on both arrays.

[3424] In another aspect, the invention features a method of analyzing 56939, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 56939 nucleic acid or amino acid sequence; and comparing the 56939 sequence with one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 56939.
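
The comparison of a query sequence against a collection of sequences can be sketched as follows. The example ranks database entries by the fraction of shared 3-mers (a Jaccard score), which is only a crude stand-in for the alignment-based database searches that would normally be used. The query and database sequences are short invented strings, not any sequence of this disclosure.

```python
def kmers(seq, k=3):
    """Set of all overlapping k-letter words in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_by_similarity(query, database, k=3):
    """Rank database entries by Jaccard similarity of their k-mer sets."""
    query_words = kmers(query, k)
    scores = []
    for name, seq in database.items():
        words = kmers(seq, k)
        union = query_words | words
        score = len(query_words & words) / len(union) if union else 0.0
        scores.append((name, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical query and a three-entry collection of sequences.
query = "ATGGCGTACG"
database = {
    "identical": "ATGGCGTACG",
    "partial": "ATGGCGTTTT",
    "unrelated": "TTTTTTTTTT",
}
ranking = rank_by_similarity(query, database)
```

The same function works for amino acid strings, since it treats sequences as plain text; the choice of k and of the scoring scheme is an assumption of the sketch.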

[3425] Detection of 56939 Variations or Mutations

[3426] The methods of the invention can also be used to detect genetic alterations in a 56939 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by misregulation in 56939 protein activity or nucleic acid expression, such as a metabolic disorder. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a 56939-protein, or the mis-expression of the 56939 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of 1) a deletion of one or more nucleotides from a 56939 gene; 2) an addition of one or more nucleotides to a 56939 gene; 3) a substitution of one or more nucleotides of a 56939 gene, 4) a chromosomal rearrangement of a 56939 gene; 5) an alteration in the level of a messenger RNA transcript of a 56939 gene, 6) aberrant modification of a 56939 gene, such as of the methylation pattern of the genomic DNA, 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of a 56939 gene, 8) a non-wild type level of a 56939-protein, 9) allelic loss of a 56939 gene, and 10) inappropriate post-translational modification of a 56939-protein.

[3427] An alteration can be detected with a probe/primer in a polymerase chain reaction (PCR), such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 56939 gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a 56939 gene under conditions such that hybridization and amplification of the 56939 gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[3428] In another embodiment, mutations in a 56939 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA are isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment lengths are determined, e.g., by gel electrophoresis, and compared. Differences in fragment lengths between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
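
The fragment length comparison can be illustrated in silico. The sketch below digests a linear DNA string at every occurrence of a recognition site and compares the resulting fragment lengths between a control and a sample. EcoRI (recognition site GAATTC, cutting after the first G) is used as the example enzyme; the control and sample sequences are invented, and only the top strand is considered.

```python
def fragment_lengths(dna, site, cut_offset):
    """Sorted fragment lengths after cutting a linear DNA at each site.

    cut_offset is the number of bases of the recognition site that stay
    on the upstream fragment (1 for EcoRI, which cuts G^AATTC).
    """
    dna, site = dna.upper(), site.upper()
    cuts = []
    start = dna.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = dna.find(site, start + 1)
    bounds = [0] + cuts + [len(dna)]
    return sorted(b - a for a, b in zip(bounds, bounds[1:]) if b > a)


# Hypothetical control carries two EcoRI sites; the sample has a point
# mutation (GAATTC -> GAGTTC) that destroys the second site.
control = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TTTT"
sample = "AAAA" + "GAATTC" + "CCCCCC" + "GAGTTC" + "TTTT"

control_fragments = fragment_lengths(control, "GAATTC", 1)
sample_fragments = fragment_lengths(sample, "GAATTC", 1)
mutation_suspected = control_fragments != sample_fragments
```

Loss of the second site merges two control fragments into one longer sample fragment, which is the kind of difference that would appear as a band shift on a gel.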

[3429] In other embodiments, genetic mutations in 56939 can be identified by hybridizing a sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the other. A different probe is located at each address of the plurality. A probe can be complementary to a region of a 56939 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of a 56939 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 56939 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
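
The "linear arrays of sequential overlapping probes" mentioned above can be sketched as a simple tiling routine. The example generates probes complementary to successive, overlapping windows of a reference strand. The 12-base reference, the probe length of 8, and the step of 2 are arbitrary illustrative values, not parameters from the cited references.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def tiling_probes(reference, probe_len=8, step=1):
    """Return (start, probe) pairs tiling the reference.

    Each probe is complementary to reference[start:start + probe_len];
    consecutive probes overlap by probe_len - step bases.
    """
    last_start = len(reference) - probe_len
    return [(start, reverse_complement(reference[start:start + probe_len]))
            for start in range(0, last_start + 1, step)]


# Hypothetical 12-base reference strand.
reference = "ATGCGTACGTTA"
probes = tiling_probes(reference, probe_len=8, step=2)
```

A parallel mutant probe set could be built the same way from a reference string carrying the variant base, giving the paired wild-type and mutant probes that the paragraph describes.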

[3430] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 56939 gene and detect mutations by comparing the sequence of the sample 56939 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.

[3431] Other methods for detecting mutations in the 56939 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl. Acad Sci USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295).

[3432] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so-called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 56939 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[3433] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 56939 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc. Natl. Acad. Sci. USA 86:2766; see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control 56939 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence; the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).

[3434] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[3435] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad. Sci USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[3436] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.

[3437] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 50%, 60%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to a 56939 nucleic acid.
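
One simple way to read "percent complementary" is as the fraction of oligonucleotide positions that form Watson-Crick pairs with an equal-length target site in antiparallel orientation. The sketch below implements that reading only; it ignores gaps, wobble pairs, and partial overlaps, and the sequences shown are invented for the example.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementary(oligo, target_site):
    """Percent of oligo bases pairing with an equal-length target site.

    Both strings are written 5' to 3'; the oligo is aligned antiparallel
    to the target, so its first base faces the target's last base.
    """
    if len(oligo) != len(target_site):
        raise ValueError("oligo and target site must have equal length")
    paired = sum(
        1
        for o, t in zip(oligo.upper(), reversed(target_site.upper()))
        if PAIR.get(o) == t
    )
    return 100.0 * paired / len(oligo)


# Hypothetical 10-base target site and two candidate oligonucleotides.
target_site = "ATGCGTACGT"
perfect = percent_complementary("ACGTACGCAT", target_site)
one_mismatch = percent_complementary("ACGTACGCAA", target_site)
```

Under this definition the second oligonucleotide, with a single mismatched base out of ten, would still fall within the "at least 90%" level recited above.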

[3438] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO:48 or the complement of SEQ ID NO:48. Different locations can be different but overlapping, or non-overlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[3439] The set can be useful, e.g., for identifying SNPs, or identifying specific alleles of 56939. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a biallelic or polymorphic locus.

[3440] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the T_(m) of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
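
A set of four oligonucleotides that differ only at the interrogation position can be generated as in the following sketch. For simplicity the oligonucleotides are written in the same sense as the reference string; in practice probes complementary to the target strand would be used. The reference sequence, position, and flank length are hypothetical.

```python
def interrogation_set(reference, position, flank=3):
    """Return {base: oligo} with A, C, G, or T at the interrogation site.

    Each oligo spans `flank` bases on either side of `position`
    (0-based) and differs from the others only at its central base.
    """
    start, end = position - flank, position + flank + 1
    if start < 0 or end > len(reference):
        raise ValueError("flank extends past the end of the reference")
    window = reference[start:end].upper()
    return {base: window[:flank] + base + window[flank + 1:]
            for base in "ACGT"}


# Hypothetical reference; position 7 (a G) is the site being queried.
reference = "GATTACAGATTACA"
oligos = interrogation_set(reference, position=7, flank=3)
```

The oligonucleotide keyed by the reference base matches the reference exactly, while the other three each carry a single central mismatch, which is what allows differential hybridization signals to identify the base present in a sample.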

[3441] In a preferred embodiment, the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, a 56939 nucleic acid.

[3442] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a 56939 gene.

[3443] Use of 56939 Molecules as Surrogate Markers

[3444] The 56939 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 56939 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 56939 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker which correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[3445] The 56939 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker which correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., a 56939 marker) transcription or expression, the amplified marker may be in a quantity which is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-56939 antibodies may be employed in an immune-based detection system for a 56939 protein marker, or 56939-specific radiolabeled probes may be used to detect a 56939 mRNA marker. Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. 
Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[3446] The 56939 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker which correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35:1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA or protein (e.g., 56939 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 56939 DNA may correlate with drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[3447] Pharmaceutical Compositions of 56939

[3448] The nucleic acid and polypeptides, fragments thereof, as well as anti-56939 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[3449] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[3450] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol or sorbitol, or sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

[3451] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filter sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[3452] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[3453] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[3454] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[3455] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[3456] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[3457] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[3458] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds which exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
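
The ratio described above can be illustrated with a minimal Python sketch. The function name and the numeric values are hypothetical and chosen only for illustration; they are not data from this specification.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index, i.e. the ratio LD50/ED50.

    Both doses must be expressed in the same units (e.g., mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical example values in mg/kg, for illustration only.
example_ti = therapeutic_index(ld50=200.0, ed50=10.0)
print(example_ti)  # 20.0
```

A larger value indicates a wider margin between the therapeutically effective dose and the lethal dose, which is why compounds with high therapeutic indices are preferred.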

[3459] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.
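
As a rough illustration of how an IC50 might be read off cell culture data, the following Python sketch interpolates between the two measured points that bracket 50% inhibition, working on a log10 concentration scale. It assumes that inhibition is reported as a percentage of the maximal effect and that maximal inhibition is 100%; the function name and the data points are hypothetical. Fitting a full dose-response curve would be a more rigorous alternative.

```python
import math
from typing import Sequence


def estimate_ic50(concentrations: Sequence[float],
                  inhibition_pct: Sequence[float]) -> float:
    """Estimate the concentration giving 50% inhibition.

    Uses linear interpolation on log10(concentration) between the two
    adjacent measurements that bracket 50% inhibition.
    """
    if len(concentrations) != len(inhibition_pct):
        raise ValueError("inputs must have the same length")
    if any(c <= 0 for c in concentrations):
        raise ValueError("concentrations must be positive")

    # Sort the measurements by increasing concentration.
    points = sorted(zip(concentrations, inhibition_pct))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if y_lo <= 50.0 <= y_hi and y_hi > y_lo:
            frac = (50.0 - y_lo) / (y_hi - y_lo)
            log_ic50 = math.log10(c_lo) + frac * (
                math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    raise ValueError("data do not bracket 50% inhibition")


# Hypothetical measurements (arbitrary concentration units).
example_ic50 = estimate_ic50([0.1, 1.0, 10.0, 100.0],
                             [10.0, 30.0, 70.0, 90.0])
print(round(example_ic50, 2))  # approximately 3.16
```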

[3460] As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
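
The weight-based ranges above translate into absolute amounts by simple multiplication. The following Python sketch shows the arithmetic for one of the stated ranges (about 1 to 10 mg/kg); the 70 kg body weight and the function name are assumptions made only for illustration.

```python
from typing import Tuple


def dose_range_mg(body_weight_kg: float,
                  low_mg_per_kg: float,
                  high_mg_per_kg: float) -> Tuple[float, float]:
    """Convert a mg/kg dosage range into absolute amounts in mg."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if not 0 <= low_mg_per_kg <= high_mg_per_kg:
        raise ValueError("range must satisfy 0 <= low <= high")
    return (low_mg_per_kg * body_weight_kg,
            high_mg_per_kg * body_weight_kg)


# Hypothetical 70 kg subject and the "about 1 to 10 mg/kg" range.
example_range = dose_range_mg(70.0, 1.0, 10.0)
print(example_range)  # (70.0, 700.0)
```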

[3461] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J. Acquired Immune Deficiency Syndromes and Human Retrovirology 14:193).

[3462] The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[3463] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

[3464] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, puromycin, maytansinoids, e.g., maytansinol (see U.S. Pat. No. 5,208,020), CC-1065 (see U.S. Pat. Nos. 5,475,092, 5,585,499, 5,846,545) and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, CC-1065, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine, vinblastine, taxol and maytansinoids). Radioactive ions include, but are not limited to, iodine, yttrium and praseodymium.

[3465] The conjugates of the invention can be used for modifying a given biological response; the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[3466] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[3467] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[3468] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[3469] Methods of Treatment for 56939

[3470] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 56939 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[3471] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. “Pharmacogenomics”, as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype”, or “drug response genotype”). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 56939 molecules of the present invention or 56939 modulators according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

[3472] In one aspect, the invention provides a method for preventing in a subject, a disease or condition associated with an aberrant or unwanted 56939 expression or activity, by administering to the subject a 56939 or an agent which modulates 56939 expression or at least one 56939 activity. Subjects at risk for a disease which is caused or contributed to by aberrant or unwanted 56939 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 56939 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 56939 aberrance, for example, a 56939, 56939 agonist or 56939 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[3473] It is possible that some 56939 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[3474] The 56939 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of disorders associated with bone metabolism, immune disorders, viral diseases, and pain or metabolic disorders.

[3475] Aberrant expression and/or activity of 56939 molecules may mediate disorders associated with bone metabolism. “Bone metabolism” refers to direct or indirect effects in the formation or degeneration of bone structures, e.g., bone formation, bone resorption, etc., which may ultimately affect the concentrations in serum of calcium and phosphate. This term also includes effects mediated by 56939 molecules in bone cells, e.g., osteoclasts and osteoblasts, that may in turn result in bone formation and degeneration. For example, 56939 molecules may support different activities of bone resorbing osteoclasts such as the stimulation of differentiation of monocytes and mononuclear phagocytes into osteoclasts. Accordingly, 56939 molecules that modulate the production of bone cells can influence bone formation and degeneration, and thus may be used to treat bone disorders. Examples of such disorders include, but are not limited to, osteoporosis, osteodystrophy, osteomalacia, rickets, osteitis fibrosa cystica, renal osteodystrophy, osteosclerosis, anti-convulsant treatment, osteopenia, fibrogenesis imperfecta ossium, secondary hyperparathyroidism, hypoparathyroidism, hyperparathyroidism, cirrhosis, obstructive jaundice, drug induced metabolism, medullary carcinoma, chronic renal disease, rickets, sarcoidosis, glucocorticoid antagonism, malabsorption syndrome, steatorrhea, tropical sprue, idiopathic hypercalcemia and milk fever.

[3476] The 56939 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of immune disorders. Examples of immune disorders or diseases include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy such as atopic allergy.

[3477] Examples of pain disorders include, but are not limited to, pain response elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia, usually referred to as hyperalgesia (described in, for example, Fields, H. L. (1987) Pain, New York: McGraw-Hill); pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches; pain associated with surgery; pain related to irritable bowel syndrome; or chest pain.

[3478] As discussed, successful treatment of 56939 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using an assay described above, that prove to exhibit negative modulatory activity, can be used in accordance with the invention to prevent and/or ameliorate symptoms of 56939 disorders. Such molecules can include, but are not limited to, peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)₂ and Fab expression library fragments, scFV molecules, and epitope-binding fragments thereof).

[3479] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[3480] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[3481] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 56939 expression is through the use of aptamer molecules specific for 56939 protein. Aptamers are nucleic acid molecules having a tertiary structure which permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. (1997) Curr. Opin. Chem Biol. 1: 5-9; and Patel, D. J. (1997) Curr Opin Chem Biol 1:32-46). Since nucleic acid molecules may in many cases be more conveniently introduced into target cells than therapeutic protein molecules may be, aptamers offer a method by which 56939 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[3482] Antibodies can be generated that are both specific for target gene product and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 56939 disorders. For a description of antibodies, see the Antibody section above.

[3483] In circumstances wherein injection of an animal or a human subject with a 56939 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 56939 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. (1999) Ann Med 31:66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. (1998) Cancer Treat Res. 94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 56939 protein. Vaccines directed to a disease characterized by 56939 expression may also be generated in this fashion.

[3484] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see e.g., Marasco et al. (1993) Proc. Natl. Acad. Sci. USA 90:7889-7893).

[3485] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 56939 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures as described above.

[3486] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED₅₀ with little or no toxicity. The dosage can vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose can be formulated in animal models to achieve a circulating plasma concentration range that includes the IC₅₀ (i.e., the concentration of the test compound that achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma can be measured, for example, by high performance liquid chromatography. Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. The compound which is able to modulate 56939 activity is used as a template, or “imprinting molecule”, to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix which contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al (1996) Current Opinion in Biotechnology 7:89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2:166-173. 
Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al (1993) Nature 361:645-647. Through the use of isotope-labeling, the “free” concentration of compound which modulates the expression or activity of 56939 can be readily monitored and used in calculations of IC₅₀.

[3487] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC₅₀. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al (1995) Analytical Chemistry 67:2142-2144.

[3488] Another aspect of the invention pertains to methods of modulating 56939 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with a 56939 or agent that modulates one or more of the activities of 56939 protein activity associated with the cell. An agent that modulates 56939 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of a 56939 protein (e.g., a 56939 substrate or receptor), a 56939 antibody, a 56939 agonist or antagonist, a peptidomimetic of a 56939 agonist or antagonist, or other small molecule.

[3489] In one embodiment, the agent stimulates one or more 56939 activities. Examples of such stimulatory agents include active 56939 protein and a nucleic acid molecule encoding 56939. In another embodiment, the agent inhibits one or more 56939 activities. Examples of such inhibitory agents include antisense 56939 nucleic acid molecules, anti-56939 antibodies, and 56939 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of a 56939 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., up regulates or down regulates) 56939 expression or activity. In another embodiment, the method involves administering a 56939 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 56939 expression or activity.

[3490] Stimulation of 56939 activity is desirable in situations in which 56939 is abnormally downregulated and/or in which increased 56939 activity is likely to have a beneficial effect. Likewise, inhibition of 56939 activity is desirable in situations in which 56939 is abnormally upregulated and/or in which decreased 56939 activity is likely to have a beneficial effect.

[3491] 56939 Pharmacogenomics

[3492] The 56939 molecules of the present invention, as well as agents, or modulators which have a stimulatory or inhibitory effect on 56939 activity (e.g., 56939 gene expression) as identified by a screening assay described herein can be administered to individuals to treat (prophylactically or therapeutically) 56939 associated disorders, e.g. metabolic disorders, associated with aberrant or unwanted 56939 activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a 56939 molecule or 56939 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a 56939 molecule or 56939 modulator.

[3493] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23:983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43:254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as single factors altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase (G6PD) deficiency is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[3494] One pharmacogenomics approach to identifying genes that predict drug response, known as a “genome-wide association”, relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high resolution map can be generated from a combination of some ten-million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
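
The grouping of individuals by SNP pattern can be sketched in a few lines of Python. This is only an illustrative toy, assuming that genotype calls are already available as strings for a small set of markers; the subject identifiers, marker names, and genotype values are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_by_snp_pattern(
    genotypes: Dict[str, Dict[str, str]],
    markers: List[str],
) -> Dict[Tuple[str, ...], List[str]]:
    """Group individuals that share the same genotype at every marker."""
    groups: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    # Iterate in sorted order so each group lists individuals predictably.
    for individual, calls in sorted(genotypes.items()):
        pattern = tuple(calls[marker] for marker in markers)
        groups[pattern].append(individual)
    return dict(groups)


# Hypothetical genotype calls at two hypothetical SNP markers.
example_genotypes = {
    "P1": {"snp_a": "A/A", "snp_b": "C/T"},
    "P2": {"snp_a": "A/G", "snp_b": "C/C"},
    "P3": {"snp_a": "A/A", "snp_b": "C/T"},
}
example_groups = group_by_snp_pattern(example_genotypes, ["snp_a", "snp_b"])
for pattern, members in example_groups.items():
    print(pattern, members)
```

Each resulting group corresponds to one genetic category in the sense described above; in practice far more markers and statistical treatment would be involved.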

[3495] Alternatively, a method termed the “candidate gene approach” can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., a 56939 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.

[3496] Alternatively, a method termed “gene expression profiling” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., a 56939 molecule or 56939 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[3497] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 56939 molecule or 56939 modulator, such as a modulator identified by one of the exemplary screening assays described herein.

[3498] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 56939 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 56939 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells, e.g., human cells, will become sensitive to treatment with an agent to which the unmodified target cells were resistant.

[3499] Monitoring the influence of agents (e.g., drugs) on the expression or activity of a 56939 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 56939 gene expression, protein levels, or upregulate 56939 activity, can be monitored in clinical trials of subjects exhibiting decreased 56939 gene expression, protein levels, or downregulated 56939 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 56939 gene expression, protein levels, or downregulate 56939 activity, can be monitored in clinical trials of subjects exhibiting increased 56939 gene expression, protein levels, or upregulated 56939 activity. In such clinical trials, the expression or activity of a 56939 gene, and preferably, other genes that have been implicated in, for example, a 56939-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[3500] 56939 Informatics

[3501] The sequence of a 56939 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a 56939. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 56939 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequences, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[3502] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[3503] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[3504] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples of annotations to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites, and splice junctions. Non-limiting examples of annotations to amino acid sequences include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
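
The two-table arrangement described above can be sketched, by way of non-limiting example, as follows. The sketch uses the SQLite engine bundled with Python rather than Sybase or Oracle, purely so that it is self-contained; the table names, column names, the identifier "example_seq" and the twelve-residue sequence are hypothetical placeholders and are not a 56939 sequence.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- First table: sequence in the first column, identifier in the second.
CREATE TABLE sequence (
    residues    TEXT NOT NULL,
    sequence_id TEXT PRIMARY KEY
);
-- Second table: identifier, descriptor, initial and ultimate positions.
CREATE TABLE annotation (
    sequence_id TEXT NOT NULL REFERENCES sequence(sequence_id),
    descriptor  TEXT NOT NULL,
    start_pos   INTEGER NOT NULL,  -- 1-based, inclusive
    end_pos     INTEGER NOT NULL   -- 1-based, inclusive
);
""")

# Placeholder data for illustration only.
conn.execute("INSERT INTO sequence VALUES (?, ?)", ("ATGGCCAAGTAA", "example_seq"))
conn.executemany(
    "INSERT INTO annotation VALUES (?, ?, ?, ?)",
    [
        ("example_seq", "open reading frame", 1, 12),
        ("example_seq", "start codon", 1, 3),
    ],
)


def annotated_regions(conn, sequence_id):
    """Return (descriptor, residues of the annotated span) for one sequence."""
    rows = conn.execute(
        "SELECT a.descriptor, "
        "       substr(s.residues, a.start_pos, a.end_pos - a.start_pos + 1) "
        "FROM annotation a JOIN sequence s ON s.sequence_id = a.sequence_id "
        "WHERE a.sequence_id = ? "
        "ORDER BY a.start_pos, a.end_pos",
        (sequence_id,),
    )
    return rows.fetchall()


regions = annotated_regions(conn, "example_seq")
```

Keeping the annotations in their own table allows any number of annotations to refer to the same stored sequence through the shared identifier field.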

[3505] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.
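
As a non-limiting illustration of such a search, the following sketch locates every exact occurrence of a target sequence within a collection of stored sequences. It is a simple substring search, not a BLAST search, and it assumes that the stored sequences have already been read from the data storage means into memory; the function name, identifiers and sequences are hypothetical.

```python
def find_target(stored_sequences, target):
    """Return {identifier: [1-based start positions]} for exact matches.

    stored_sequences: dict mapping identifier -> sequence string.
    target: the target sequence or motif to look for (exact match only).
    """
    hits = {}
    for seq_id, residues in stored_sequences.items():
        positions = []
        start = residues.find(target)
        while start != -1:
            positions.append(start + 1)  # report 1-based positions
            # Resume one residue later so overlapping matches are found.
            start = residues.find(target, start + 1)
        if positions:
            hits[seq_id] = positions
    return hits


# Made-up sequences for illustration only.
stored = {"seq_one": "ATGGCCATGGTT", "seq_two": "CCCGGGTTT"}
hits = find_target(stored, "ATGG")
```

An exact search of this kind only reports identical fragments; tolerating mismatches or gaps would call for an alignment-based comparison such as those referred to above.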

[3506] Thus, in one aspect, the invention features a method of analyzing 56939, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing a 56939 nucleic acid or amino acid sequence; and comparing the 56939 sequence with a second sequence, e.g., one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 56939. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[3507] The method can include evaluating the sequence identity between a 56939 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.
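
One way of evaluating sequence identity between two sequences that have already been aligned can be sketched as follows. This is a minimal sketch under stated assumptions: the two inputs are equal-length aligned strings with "-" marking gaps, a position gapped in only one sequence counts as a mismatch, and positions gapped in both sequences are ignored. Other conventions for computing percent identity exist, and this sketch is not intended to define the term as used elsewhere herein.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue  # column carries no information; skip it
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


identity = percent_identity("ACGT", "ACGA")
```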

[3508] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
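
The relationship between target length and chance occurrence can be illustrated with a rough estimate. The sketch below assumes, purely for illustration, that database residues are independent and equally likely (an idealization that real sequences do not satisfy), so that a target of length L drawn from an alphabet of size k is expected to occur by chance about (N - L + 1) / k^L times in a database of N residues. The function name and the numerical values are hypothetical.

```python
def expected_chance_matches(target_length, database_length, alphabet_size):
    """Expected number of chance exact matches under a uniform random model."""
    if target_length > database_length:
        return 0.0
    # Number of positions at which the target could start.
    windows = database_length - target_length + 1
    return windows * alphabet_size ** (-target_length)


# A 6-nucleotide target versus a 30-nucleotide target in the same
# hypothetical database of one billion nucleotides.
short_target = expected_chance_matches(6, 10**9, 4)
long_target = expected_chance_matches(30, 10**9, 4)
```

Under this idealized model the expected count falls by a factor of k for every residue added to the target, which is why longer targets are far less likely to be present as random occurrences.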

[3509] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[3510] Thus, the invention features a method of making a computer readable record of a 56939 sequence which includes recording the sequence on a computer readable matrix. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.
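
A computer readable record of this kind can be sketched, by way of non-limiting example, as a small structured record written to a text file. The field names, the JSON file format, and the helper functions below are assumptions made for illustration; the present disclosure does not prescribe any particular record layout.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class SequenceRecord:
    """Hypothetical record layout; every field other than the sequence is optional."""
    identifier: str
    nucleotide_sequence: str
    orf_start: Optional[int] = None                      # identification of an ORF
    orf_end: Optional[int] = None
    transcription_start: Optional[int] = None            # start of transcription
    transcription_terminator: Optional[int] = None       # transcription terminator
    translated_region_5prime_end: Optional[int] = None   # 5' end of translated region
    amino_acid_sequence: Optional[str] = None            # full length or mature protein
    features: List[dict] = field(default_factory=list)   # domains, regions, or sites


def write_record(record, path):
    """Record the sequence and its annotations as a JSON text file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(asdict(record), fh, indent=2)


def read_record(path):
    """Read a record previously written by write_record."""
    with open(path, encoding="utf-8") as fh:
        return SequenceRecord(**json.load(fh))
```

A plain text format is used here so that the record can be read back by other programs; a database table or another file format would serve equally well.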

[3511] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing a 56939 sequence, or record, in machine-readable form; comparing a second sequence to the 56939 sequence; thereby analyzing a sequence. Comparison can include comparing to sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 56939 sequence includes a sequence being compared. In a preferred embodiment the 56939 or second sequence is stored on a first computer, e.g., at a first site and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 56939 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[3512] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has a 56939-associated disease or disorder or a pre-disposition to a 56939-associated disease or disorder, wherein the method comprises the steps of determining 56939 sequence information associated with the subject and based on the 56939 sequence information, determining whether the subject has a 56939-associated disease or disorder or a pre-disposition to a 56939-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[3513] The invention further provides in an electronic system and/or in a network, a method for determining whether a subject has a 56939-associated disease or disorder or a pre-disposition to a disease associated with a 56939 wherein the method comprises the steps of determining 56939 sequence information associated with the subject, and based on the 56939 sequence information, determining whether the subject has a 56939-associated disease or disorder or a pre-disposition to a 56939-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, comparing the 56939 sequence of the subject to the 56939 sequences in the database to thereby determine whether the subject has a 56939-associated disease or disorder, or a pre-disposition for such.

[3514] The present invention also provides in a network, a method for determining whether a subject has a 56939-associated disease or disorder or a pre-disposition to a 56939-associated disease or disorder, said method comprising the steps of receiving 56939 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 56939 and/or corresponding to a 56939-associated disease or disorder (e.g., metabolic disorders), and based on one or more of the phenotypic information, the 56939 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has a 56939-associated disease or disorder or a pre-disposition to a 56939-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[3515] The present invention also provides a method for determining whether a subject has a 56939-associated disease or disorder or a pre-disposition to a 56939-associated disease or disorder, said method comprising the steps of receiving information related to 56939 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 56939 and/or related to a 56939-associated disease or disorder, and based on one or more of the phenotypic information, the 56939 information, and the acquired information, determining whether the subject has a 56939-associated disease or disorder or a pre-disposition to a 56939-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[3516] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

Background of the 33410 Invention

[3517] Higher eukaryotes have many distinct carboxylesterases. Among the different types of carboxylesterases are those that act on carboxylic esters. Carboxylesterases have been classified into three categories (A, B and C) on the basis of differential patterns of inhibition by organophosphates (Myers, M. et al. (1988) Mol. Biol. Evol. 5(2):113-119). The sequences of a number of type-B carboxylesterases indicate that the majority are evolutionarily related. Members of the type-B carboxylesterase family include acetylcholinesterases from vertebrates and Drosophila, mammalian cholinesterases, and mammalian bile salt activated lipases, among others.

[3518] Neuroligins are cell surface molecules composed of five domains: an N-terminal cleaved signal sequence, a large extracellular domain homologous to carboxylesterases, a linker domain between the transmembrane region and the carboxylesterase homology domain, a single transmembrane region, and a cytoplasmic tail (Ichtchenko, K. et al. (1996) J. Biol. Chem. 271(5):2676-2682). Sequence comparisons place the neuroligins in the large family of carboxylesterase homology domain proteins that includes thyroglobulin, acetylcholinesterase, and gliotactin. However, neuroligins are only distantly related to these proteins, and thus appear to form a unique subset of the carboxylesterase family. At least three neuroligins have been cloned, namely Neuroligins 1, 2 and 3 (Ichtchenko, K. et al. (1995) Cell 81:435-443; Ichtchenko, K. et al. (1996) supra). These three neuroligins are expressed at high levels in the brain, primarily in neurons. Neuroligins have been shown to mediate cell adhesion events associated with neuronal development and/or maintenance. For example, Neuroligin 1 has been found to be enriched in postsynaptic densities where it may recruit receptors, channels, and signal transduction molecules at synaptic sites of cell adhesion (Song et al. (1999) Proc. Natl. Acad. Sci. 96(3):1100-5).

[3519] Functionally, neuroligins bind tightly, in a calcium-dependent manner, to the extracellular domains of the polymorphic cell surface proteins known as β-neurexins. Neurexins are neuronal cell surface proteins that exhibit a high degree of diversity (Ushkaryov et al. (1994) J. Biol. Chem. 269: 11987-11992). Neuroligin-β-neurexin interactions have been implicated in mediating recognition processes between neurons that give rise to neuronal developmental events such as synaptogenesis (e.g., specification of excitatory synapses) (Brose, N. (1999) Naturwissenschaften 86(11):516-24).

Summary of the 33410 Invention

[3520] The present invention is based, in part, on the discovery of a novel carboxylesterase family member, referred to herein as “33410”. The nucleotide sequence of a cDNA encoding 33410 is shown in SEQ ID NO:53, and the amino acid sequence of a 33410 polypeptide is shown in SEQ ID NO:54. In addition, the nucleotide sequence of the coding region is depicted in SEQ ID NO:55.

[3521] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes a 33410 protein or polypeptide, e.g., a biologically active portion of the 33410 protein. In a preferred embodiment the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO:54. In other embodiments, the invention provides isolated 33410 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO:53, SEQ ID NO:55, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO:53, SEQ ID NO:55, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under stringent hybridization conditions to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:53 or 55, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 33410 protein or an active fragment thereof.

[3522] In a related aspect, the invention further provides nucleic acid constructs that include a 33410 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included are vectors and host cells containing the 33410 nucleic acid molecules of the invention, e.g., vectors and host cells suitable for producing 33410 nucleic acid molecules and polypeptides.

[3523] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 33410-encoding nucleic acids.

[3524] In still another related aspect, isolated nucleic acid molecules that are antisense to a 33410 encoding nucleic acid molecule are provided.

[3525] In another aspect, the invention features 33410 polypeptides, and biologically active or antigenic fragments thereof that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 33410-mediated or -related disorders, e.g., a cell adhesion disorder; a disorder involving aberrant cellular proliferation or differentiation (e.g., a cancer); a CNS disorder, such as a neurodegenerative disorder, e.g., Alzheimer's disease, dementias related to Alzheimer's disease (such as Pick's disease), Parkinson's disease and other diffuse Lewy body diseases, multiple sclerosis, amyotrophic lateral sclerosis, progressive supranuclear palsy, epilepsy, Creutzfeldt-Jakob disease, AIDS related dementia, familial infantile convulsions, paroxysmal choreoathetosis; a psychiatric disorder (e.g., depression, schizophrenic disorders, Korsakoff's psychosis, mania, anxiety disorders, or phobic disorders); a learning or memory disorder (e.g., amnesia or age-related memory loss); and migraine.

[3526] In another embodiment, the invention provides 33410 polypeptides having a 33410 activity. Preferred polypeptides are 33410 proteins including at least one carboxylesterase domain, a signal peptide and at least one transmembrane domain, and, preferably, having a 33410 activity, e.g., a 33410 activity as described herein (e.g., the modulation of one or more of: a cell-cell (e.g., neuron-neuron, or neuron-glia) recognition event or adhesion; synaptogenesis; membrane excitability; neurite outgrowth; signal transduction; or cell (e.g., neural or a cancer cell) proliferation, growth, differentiation, or migration).

[3527] In other embodiments, the invention provides 33410 polypeptides, e.g., a 33410 polypeptide having the amino acid sequence shown in SEQ ID NO:54; the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO:54; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under stringent hybridization conditions to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:53 or SEQ ID NO:55, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 33410 protein or an active fragment thereof.

[3528] In a related aspect, the invention further provides nucleic acid constructs that include a 33410 nucleic acid molecule described herein.

[3529] In a related aspect, the invention provides 33410 polypeptides or fragments operatively linked to non-33410 polypeptides to form fusion proteins.

[3530] In another aspect, the invention features antibodies and antigen-binding fragments thereof, that react with, or more preferably specifically bind 33410 polypeptides.

[3531] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 33410 polypeptides or nucleic acids. In yet another aspect, the invention features a method of evaluating, or identifying, an agent, e.g., an agent as described herein, e.g., a compound (e.g., a polypeptide, peptide, a peptide fragment, a peptidomimetic, a small molecule), for the ability to modulate, e.g. inhibit, the activity or expression of a 33410 polypeptide. Such agents are useful for treating or preventing cardiovascular disorders (e.g., an endothelial cell disorder) or proliferation-related disorders, e.g., cancer, as described herein. The method includes:

[3532] providing a test agent, and a 33410 nucleic acid or polypeptide, or a cell expressing a 33410 (e.g., a cancer cell or cell line);

[3533] contacting said test agent, and said 33410 nucleic acid or polypeptide, or said cell expressing said 33410, under conditions that allow an interaction (e.g., activity or expression) between said 33410 nucleic acid or polypeptide and said test agent to occur; and

[3534] determining whether said test agent interacts with, e.g., binds the 33410 nucleic acid or polypeptide, or modulates, e.g., inhibits, the expression or activity of said 33410 polypeptide, e.g., wherein interacting with or binding the 33410 nucleic acid or polypeptide, or a change, e.g., a decrease, in the level of activity or expression of said 33410 polypeptide in the presence of the test agent relative to the activity or expression in the absence of the test agent, is indicative of modulation, e.g., inhibition, of 33410 activity or expression.

[3535] In a preferred embodiment, the method further comprises the step of evaluating the test agent in the 33410-expressing cell, e.g., an endothelial or a cancer cell, in vitro, or in vivo (e.g., in a subject, e.g., a patient having a cancer or a cardiovascular disorder), to thereby determine the effect of the test agent on the activity or expression of the 33410.

[3536] In a preferred embodiment, the contacting step occurs in vitro or ex vivo. For example, a sample, e.g., a blood, biopsy or tissue sample, is obtained from the subject. Preferably, the sample contains a 33410-expressing cell.

[3537] In a preferred embodiment, the contacting step occurs in vivo, for example, by administering to the subject a detectably labeled agent that interacts with the 33410 nucleic acid or polypeptide, such that a signal is generated relative to the level of activity or expression of the 33410 nucleic acid or polypeptide.

[3538] In a preferred embodiment, the test agent is an inhibitor (partial or complete inhibitor) of the 33410 polypeptide activity or expression.

[3539] In preferred embodiments, the test agent is a peptide, a small molecule, e.g., a member of a combinatorial library (e.g., a peptide or organic combinatorial library, or a natural product library), or an antibody, or any combination thereof.

[3540] In additional preferred embodiments, the test agent is an antisense, a ribozyme, a triple helix molecule, or any combination thereof. In a preferred embodiment, a plurality of test agents, e.g., library members, is tested. In a preferred embodiment, the plurality of test agents, e.g., library members, includes at least 10, 10², 10³, 10⁴, 10⁵, 10⁶, 10⁷, or 10⁸ compounds. In a preferred embodiment, the plurality of test agents, e.g., library members, share a structural or functional characteristic.

[3541] In a preferred embodiment, the test agent is a peptide or a small organic molecule. In a preferred embodiment, the method is performed in cell-free conditions (e.g., a reconstituted system).

[3542] In a preferred embodiment, the method further includes: contacting said agent with a test cell, or a test animal, to evaluate the effect of the test agent on the activity or expression of 33410.

[3543] In a preferred embodiment, the ability of the agent to modulate the activity or expression of 33410 is evaluated in a second system, e.g., a cell-free, cell-based, or an animal system.

[3544] In a preferred embodiment, the ability of the agent to modulate the activity or expression of 33410 is evaluated in a cell based system, e.g., a two-hybrid assay.

[3545] In still another aspect, the invention provides a process for modulating 33410 polypeptide or nucleic acid expression or activity, e.g. using one or more of the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 33410 polypeptides or nucleic acids, such as neurological conditions. For example, one or more of the screened compounds can be used to modulate one or more of cell-cell adhesion events, membrane excitability, neurite outgrowth, synaptogenesis, signal transduction, cell (e.g., neural cell) proliferation, growth, differentiation, or migration. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 33410 polypeptides or nucleic acids, such as neurodegenerative conditions, and aberrant or deficient cellular proliferation or differentiation.

[3546] In yet another aspect, the invention provides methods for inhibiting the proliferation or inducing the killing, of a 33410-expressing cell, e.g., a 33410-expressing hyperproliferative cell, comprising contacting the cell with a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 33410 polypeptide or nucleic acid.

[3547] In a preferred embodiment, the contacting step is effected in vitro or ex vivo. In other embodiments, the contacting step is effected in vivo, e.g., in a subject (e.g., a mammal, e.g., a human), as part of a therapeutic or prophylactic protocol.

[3548] In a preferred embodiment, the 33410-expressing cell is found in a solid tumor, a soft tissue tumor, or a metastatic lesion. Preferably, the tumor is a sarcoma, a carcinoma, or an adenocarcinoma. Preferably, the cell is found in a cancerous or pre-cancerous tissue, e.g., a cancerous or pre-cancerous tissue where a 33410 polypeptide or nucleic acid is expressed. In a preferred embodiment, the 33410-expressing cell is an endothelial cell, e.g., a blood vessel associated cell.

[3549] In a preferred embodiment, the compound is an inhibitor of a 33410 polypeptide. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a peptidomimetic, e.g., a phosphonate analog of a peptide substrate, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion).

[3550] In a preferred embodiment, the compound is an inhibitor of a 33410 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[3551] In a preferred embodiment, the compound is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[3552] In another aspect, the invention features methods for treating or preventing a disorder characterized by aberrant cellular proliferation or differentiation of a 33410-expressing cell, in a subject. Preferably, the method includes administering to the subject (e.g., a mammal, e.g., a human) an effective amount of a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 33410 polypeptide or nucleic acid.

[3553] In a preferred embodiment, the disorder is a cancerous or pre-cancerous condition. Most preferably, the disorder is a cancer, e.g., a solid tumor, a soft tissue tumor, or a metastatic lesion. Preferably, the cancer is a sarcoma, a carcinoma, or an adenocarcinoma. Preferably, the cancer is found in a tissue where a 33410 polypeptide or nucleic acid is expressed, e.g., breast, ovarian, colon, liver, lung, kidney, or brain cancer. Most preferably, the cancer is found in the breast, ovary, colon, liver or lung.

[3554] In a preferred embodiment, the disorder is an endothelial cell disorder, e.g., a disorder characterized by aberrant, unregulated, or unwanted endothelial cell activity, e.g., proliferation, migration, angiogenesis, or vascularization; or by aberrant expression of cell surface adhesion molecules or genes associated with angiogenesis. Examples of endothelial cell disorders include tumorigenesis, tumor metastasis, psoriasis, diabetic retinopathy, endometriosis, Graves' disease, ischemic disease (e.g., atherosclerosis), and chronic inflammatory diseases (e.g., rheumatoid arthritis).

[3555] In a preferred embodiment, the compound is an inhibitor of a 33410 polypeptide. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion). The inhibitor can also be a trypsin inhibitor or a derivative thereof, or a peptidomimetic, e.g., a phosphonate analog of a peptide substrate.

[3556] In a preferred embodiment, the compound is an inhibitor of a 33410 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[3557] In a preferred embodiment, the compound is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[3558] In a preferred embodiment, the subject is a mammal, e.g., a human; a patient, e.g., a patient with a cancer or a cardiovascular condition.

[3559] The invention also provides assays for determining the activity of or the presence or absence of 33410 polypeptides or nucleic acid molecules in a biological sample, including for disease diagnosis. Preferably, the biological sample includes a cancerous or pre-cancerous cell or tissue. For example, the cancerous tissue can be a solid tumor, a soft tissue tumor, or a metastatic lesion. Preferably, the cancerous tissue is a sarcoma, a carcinoma, or an adenocarcinoma. In other embodiments, the biological sample includes endothelial cells.

[3560] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 33410 polypeptide or nucleic acid molecule, including for disease diagnosis. Preferably, the biological sample includes a cancerous or pre-cancerous cell or tissue. For example, the cancerous tissue can be a solid tumor, a soft tissue tumor, or a metastatic lesion. Preferably, the cancerous tissue is a sarcoma, a carcinoma, or an adenocarcinoma. In other embodiments, the biological sample includes endothelial cells.

[3561] In another aspect, the invention features a method of diagnosing, or staging, a 33410-mediated disorder, e.g., a neurological disorder, or a cancer disorder, in a subject. The method includes evaluating the expression or activity of a 33410 nucleic acid or polypeptide, thereby diagnosing or staging the disorder. In a preferred embodiment, the expression or activity is compared with a reference value, e.g., a difference in the expression or activity level of the 33410 nucleic acid or polypeptide relative to a reference, e.g., a normal subject or a cohort of normal subjects, is indicative of the disorder, or a stage in the disorder.

[3562] In a preferred embodiment, the subject is a human. For example, the subject is a human suffering from, or at risk of, a cardiovascular or a cancer disorder as described herein.

[3563] In a preferred embodiment, the evaluating step occurs in vitro or ex vivo. For example, a sample, e.g., a blood or tissue sample, a biopsy, is obtained from the subject. Preferably, the sample contains a cancer or an endothelial cell.

[3564] In a preferred embodiment, the evaluating step occurs in vivo. For example, by administering to the subject a detectably labeled agent that interacts with the 33410-associated nucleic acid or polypeptide, such that a signal is generated relative to the level of activity or expression of the 33410 nucleic acid or polypeptide.

[3565] In preferred embodiments, the method is performed: on a sample from a subject, a sample from a human subject; e.g., a sample of a patient suffering from, or at risk of, a cardiovascular or a cancer disorder as described herein; to determine if the individual from which the target nucleic acid or protein is taken should receive a drug or other treatment; to diagnose an individual for a disorder or for predisposition to resistance to treatment, to stage a disease or disorder.

[3566] In a still further aspect, the invention provides methods for evaluating the efficacy of a treatment of a disorder, e.g., proliferative disorder, e.g., a cancer (e.g., breast, ovarian, colon, liver or lung cancer); or an endothelial cell disorder. The method includes: treating a subject, e.g., a patient or an animal, with a protocol (e.g., treating a subject with one or more of: chemotherapy, radiation, and/or a compound identified using the methods described herein); and evaluating the expression of a 33410 nucleic acid or polypeptide before and after treatment. A change, e.g., a decrease or increase, in the level of a 33410 nucleic acid (e.g., mRNA) or polypeptide after treatment, relative to the level of expression before treatment, is indicative of the efficacy of the treatment of the disorder.

[3567] In a preferred embodiment, the disorder is a cancer of the breast, ovary, colon, lung, or liver. In other embodiments, the disorder is an endothelial cell disorder. The level of 33410 nucleic acid or polypeptide expression can be detected by any method described herein.

[3568] In a preferred embodiment, the evaluating step includes obtaining a sample (e.g., a tissue sample, e.g., a biopsy, or a fluid sample) from the subject, before and after treatment, and comparing the level of expression of a 33410 nucleic acid (e.g., mRNA) or polypeptide before and after treatment.

[3569] In another aspect, the invention provides methods for evaluating the efficacy of a therapeutic or prophylactic agent (e.g., an anti-neoplastic agent). The method includes: contacting a sample with an agent (e.g., a compound identified using the methods described herein, a cytotoxic agent) and, evaluating the expression or activity of a 33410 nucleic acid or polypeptide in the sample before and after the contacting step. A change, e.g., a decrease or increase, in the level of 33410 nucleic acid (e.g., mRNA) or polypeptide in the sample obtained after the contacting step, relative to the level of expression in the sample before the contacting step, is indicative of the efficacy of the agent. The level of 33410 nucleic acid or polypeptide expression can be detected by any method described herein.

[3570] In a preferred embodiment, the sample includes cells obtained from a cancerous tissue where a 33410 polypeptide or nucleic acid is expressed, e.g., a cancer of the breast, ovary, colon, lung, or liver.

[3571] In a preferred embodiment, the sample is a tissue sample (e.g., a biopsy), a bodily fluid, or cultured cells (e.g., a tumor cell line).

[3572] In a preferred embodiment, the sample includes endothelial cells.

[3573] In another aspect, the invention features a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes a 33410 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to a 33410 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 33410 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.

[3574] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

[3575] Detailed Description of 33410

[3576] The human 33410 sequence (Example 39; SEQ ID NO:53), which is approximately 4667 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 2508 nucleotides, including the termination codon (nucleotides indicated as coding of SEQ ID NO:53; SEQ ID NO:55). The coding sequence encodes an 835 amino acid protein (SEQ ID NO:54). The human 33410 protein of SEQ ID NO:54 and FIG. 27 includes an amino-terminal hydrophobic amino acid sequence, consistent with a signal sequence, of about 14 amino acids (from amino acid 1 to about amino acid 14 of SEQ ID NO:54), which upon cleavage results in the production of a mature protein form.

[3577] Human 33410 contains the following regions or other structural features: a carboxylesterase domain (PFAM Accession Number PF00135) located at about amino acid residues 42 to 601 of SEQ ID NO:54 which includes a carboxylesterase type-B signature 2 domain (Prosite Accession Number PS00941) at about amino acid 139 to 149 of SEQ ID NO:54; and one predicted transmembrane domain at about amino acid 676 to 698 of SEQ ID NO:54.

[3578] The 33410 protein also includes the following domains: four predicted N-glycosylation sites (PS00001) at about amino acids 98 to 101, 136 to 139, 522 to 525, and 823 to 826 of SEQ ID NO:54; three predicted glycosaminoglycan attachment sites (PS00002) at about amino acids 264 to 267, 718 to 721 and 793 to 796 of SEQ ID NO:54; three predicted cAMP- and cGMP-dependent protein kinase phosphorylation sites (PS00004) at about amino acids 331 to 334, 447 to 450, and 710 to 713 of SEQ ID NO:54; six predicted Protein Kinase C phosphorylation sites (PS00005) at about amino acids 156 to 158, 430 to 432, 467 to 469, 619 to 621, 640 to 642, and 832 to 834 of SEQ ID NO:54; five predicted Casein Kinase II phosphorylation sites (PS00006) located at about amino acids 35 to 38, 187 to 190, 223 to 226, 404 to 407, and 524 to 527 of SEQ ID NO:54; one predicted Tyrosine kinase phosphorylation site (PS00007) at about amino acids 664 to 672 of SEQ ID NO:54; fourteen predicted N-myristoylation sites (PS00008) from about amino acids 10 to 15, 18 to 23, 30 to 35, 71 to 76, 95 to 100, 112 to 117, 198 to 203, 263 to 268, 292 to 297, 381 to 386, 400 to 405, 636 to 641, 716 to 721, and 750 to 755 of SEQ ID NO:54; one predicted amidation site (PS00009) at about amino acids 174 to 177 of SEQ ID NO:54; and one predicted prokaryotic membrane lipoprotein lipid attachment site (PS00013) at about amino acids 260 to 270 of SEQ ID NO:54.

[3579] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[3580] A plasmid containing the nucleotide sequence encoding human 33410 (clone Fbh33410FL) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

[3581] The 33410 protein contains a significant number of structural characteristics in common with members of the carboxylesterase family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics. Carboxylesterase family members, such as acetylcholinesterase, are known to act on carboxylic esters. Based on the differential patterns of inhibition by organophosphates, carboxylesterases have been classified into three categories (A, B and C) (Myers, M. et al. (1988) Mol. Biol. Evol. 5(2):113-119). 33410 proteins of the invention include a carboxylesterase type B signature 2 domain located at about amino acids 139 to 149 of SEQ ID NO:54, which suggests that the 33410 proteins belong to the carboxylesterase type B family.

[3582] Carboxylesterase family members are characterized by a catalytic triad of amino acids: a serine, a glutamate or aspartate and a histidine. The sequence around the active site serine is well conserved and can be used as a signature pattern. A second signature pattern is located in the N-terminal section and contains a cysteine involved in disulfide bond formation. Typical consensus patterns of carboxylesterases are F-[GR]-G-x(4)-[LIVM]-x-[LIV]-x-G-x-S-[STAG]-G (SEQ ID NO:59) (where S is the active site residue) and [ED]-D-C-L-[YT]-[LIV]-[DNS]-[LIV]-[LIVFYW]-x-[PQR] (where C is involved in a disulfide bond). 33410 proteins have a similar pattern at about amino acids 252 to 267 of SEQ ID NO:54 as follows: FGGDPERITIFGSGAG.
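The comparison between the first consensus pattern and the quoted 33410 segment can be illustrated with a minimal Python sketch. The helper names (prosite_elements, prosite_to_regex, mismatches) are hypothetical and are not part of any PROSITE or PFAM tool; the sketch assumes only the standard PROSITE conventions that "x" denotes any residue, square brackets list allowed residues, curly braces list excluded residues, and "(n)" is a fixed repeat count. Under these assumptions the quoted segment has the same length as the pattern and departs from it at a single position, the one the consensus assigns to the active site serine, which is consistent with the description of the segment as a "similar" rather than identical pattern.

```python
import re

def prosite_elements(pattern):
    """Split a PROSITE-style pattern into (regex_class, repeat_count) pairs."""
    elements = []
    for token in pattern.split("-"):
        m = re.fullmatch(r"(.+?)(?:\((\d+)\))?", token)
        core, count = m.group(1), int(m.group(2) or 1)
        if core == "x":
            core = "."                       # x = any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"   # {..} = excluded residues
        elements.append((core, count))
    return elements

def prosite_to_regex(pattern):
    """Render the pattern as an ordinary regular expression string."""
    return "".join(c if n == 1 else f"{c}{{{n}}}"
                   for c, n in prosite_elements(pattern))

def mismatches(pattern, segment):
    """Return (1-based position, residue) pairs that violate the pattern."""
    classes = [c for c, n in prosite_elements(pattern) for _ in range(n)]
    if len(classes) != len(segment):
        raise ValueError("segment length differs from pattern length")
    return [(i, aa)
            for i, (c, aa) in enumerate(zip(classes, segment), start=1)
            if not re.fullmatch(c, aa)]

# Pattern and segment as quoted in the text above.
ACTIVE_SITE_PATTERN = "F-[GR]-G-x(4)-[LIVM]-x-[LIV]-x-G-x-S-[STAG]-G"
SEGMENT_33410 = "FGGDPERITIFGSGAG"   # about amino acids 252 to 267

print(prosite_to_regex(ACTIVE_SITE_PATTERN))
# F[GR]G.{4}[LIVM].[LIV].G.S[STAG]G
print(mismatches(ACTIVE_SITE_PATTERN, SEGMENT_33410))
# [(14, 'G')]
```

Position 14 of the segment corresponds to about amino acid 265 of SEQ ID NO:54 if the segment begins at amino acid 252; because the coordinates are stated as approximate, that residue number is an estimate only.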

[3583] 33410 proteins of the invention are homologous to rat neuroligin proteins, in particular, the rat neuroligin-2 protein (FIG. 29). Thus, the proteins of the invention are members of the neuroligin subfamily of carboxylesterase type B proteins. As used herein, the term “neuroligin” refers to cell surface molecules composed of five domains: an N-terminal cleaved signal sequence, a large extracellular domain homologous to esterases, a linker domain between the transmembrane region and the esterase homology domain, a single transmembrane region, and a cytoplasmic tail (Ichtchenko, K. et al. (1996) J. Biol. Chem. 271(5):2676-2682). Neurexin binding is known to be calcium-dependent and is mediated by an EF hand motif. 33410 proteins contain several structural features of neuroligin family members. For example, 33410 proteins have an extracellular domain located at about amino acids 1-675 of SEQ ID NO:54 (which includes a carboxylesterase domain located at amino acids 42 to 601 of SEQ ID NO:54), a transmembrane domain located at about amino acids 676-698, and a short cytoplasmic domain located at about amino acids 699-835 of SEQ ID NO:54. 33410 proteins further include an EF hand motif from about amino acid 387 to 416 of SEQ ID NO:54, and six conserved cysteine residues located at about amino acids 106, 141, 270, 317, 328, 487 and 521 of SEQ ID NO:54.
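The domain layout described above can be recorded as a small coordinate table and checked for internal consistency. The following Python sketch is illustrative only: the Region type and the helper functions are hypothetical names, and the coordinates are the approximate ("about") residue positions stated in the text, treated here as exact, 1-based, inclusive boundaries.

```python
from typing import NamedTuple, Optional

class Region(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1

PROTEIN_LENGTH = 835  # amino acids in SEQ ID NO:54, per the text

# Topological segments, in N- to C-terminal order.
TOPOLOGY = [
    Region("extracellular domain", 1, 675),
    Region("transmembrane domain", 676, 698),
    Region("cytoplasmic domain", 699, 835),
]

# Features the text places within the protein.
FEATURES = [
    Region("signal sequence", 1, 14),
    Region("carboxylesterase domain", 42, 601),
    Region("EF hand motif", 387, 416),
]

def tiles_protein(regions, total_length) -> bool:
    """True if the regions cover residues 1..total_length with no gap or overlap."""
    expected_start = 1
    for region in regions:
        if region.start != expected_start:
            return False
        expected_start = region.end + 1
    return expected_start == total_length + 1

def enclosing_segment(feature, regions) -> Optional[str]:
    """Name of the topological segment that fully contains the feature, if any."""
    for region in regions:
        if region.start <= feature.start and feature.end <= region.end:
            return region.name
    return None

if __name__ == "__main__":
    print(tiles_protein(TOPOLOGY, PROTEIN_LENGTH))
    for feature in FEATURES:
        print(feature.name, feature.length, enclosing_segment(feature, TOPOLOGY))
```

With the stated coordinates, the three topological segments tile the full 835 residues, and the signal sequence, carboxylesterase domain, and EF hand motif all fall within the extracellular segment.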

[3584] Typically, members of the neuroligin family are expressed at high levels in the brain, primarily in neurons. Typically, neuroligins are capable of mediating cell adhesion events associated with development and/or maintenance, e.g., neural events such as synaptogenesis, recruitment of receptors, channels, and signal transduction molecules at synaptic sites (e.g., at excitatory synapses) (Song et al. (1999) Proc. Natl. Acad. Sci. 96(3):1100-5). Typically, neuroligins are capable of interacting with a cell surface protein, e.g., a neurexin (e.g., a β-neurexin). Typically, neuroligin-β-neurexin interactions mediate cell adhesion events, e.g., neuron-neuron, or neuron-glia cell adhesion events.

[3585] A 33410 polypeptide can include a “carboxylesterase domain” or regions homologous with a “carboxylesterase” domain. A 33410 can optionally further include at least one transmembrane domain, at least one extracellular domain, and at least one intracellular domain. A 33410 can optionally further include at least one, two, three, preferably four N-glycosylation sites; at least one, two, preferably three cAMP/cGMP phosphorylation sites; at least one, two, three, four, five, preferably six protein kinase C sites; at least one, two, three, four, preferably five casein kinase II sites; at least one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, and preferably fourteen N-myristoylation sites; at least one tyrosine phosphorylation site; at least one amidation site; and at least one, two, and preferably three glycosaminoglycan attachment sites.

[3586] As used herein, the term “carboxylesterase domain” refers to a protein domain which includes a carboxylesterase type B signature 2 domain. Preferably, the carboxylesterase type B signature 2 domain is about 5 to 20 amino acids, more preferably 8-15, most preferably 11 amino acids and includes the sequence [EDX(0, 1)CLYX] (SEQ ID NO:60). Most preferably, the carboxylesterase type B signature 2 domain has the amino acid sequence: EDCLYNIYVP located at about amino acids 139 to 149 of SEQ ID NO:54. Preferably, the carboxylesterase domain has an amino acid sequence of about 450 to about 650 amino acid residues and has a bit score for the alignment of the sequence to the carboxylesterase domain (HMM) of at least 100. Preferably, a carboxylesterase domain includes at least about 450 to about 600 amino acids, more preferably about 500 to about 575 amino acid residues, about 550 to 570, or about 559 amino acids and has a bit score for the alignment of the sequence to the carboxylesterase domain (HMM) of at least 200, preferably 300, more preferably 400 or greater. The carboxylesterase domain (HMM) has been assigned the PFAM Accession (PF00135) (http://genome.wustl.edu/Pfam/html). An alignment of the carboxylesterase domain (from about amino acids 42 to about 601 of SEQ ID NO:54) of human 33410 with a consensus amino acid sequence derived from a hidden Markov model (PFAM) is depicted in FIG. 28.

[3587] In a preferred embodiment, 33410 polypeptide or protein has a “carboxylesterase domain” or a region which includes at least about 400 to about 650 amino acids, preferably 450 to about 600 amino acids, more preferably about 500 to about 570 amino acid residues, about 550 to 570, or about 559 amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “carboxylesterase domain,” e.g., the carboxylesterase domain of human 33410 (e.g., residues 42 to 601 of SEQ ID NO:54).

[3588] To identify the presence of a “carboxylesterase” domain in a 33410 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against a database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference. A search was performed against the HMM database resulting in the identification of a “carboxylesterase domain” in the amino acid sequence of human 33410 at about residues 42 to 601 of SEQ ID NO:54 (see FIG. 26).

[3589] In one embodiment, a 33410 protein includes at least one transmembrane domain. As used herein, the term “transmembrane domain” includes an amino acid sequence of about 15 amino acid residues in length that spans a phospholipid membrane. More preferably, a transmembrane domain includes at least about 16, 18, 20, 21, 23, 25, 30, 35 or 40 amino acid residues and spans a phospholipid membrane. Transmembrane domains are rich in hydrophobic residues, and typically have an α-helical structure. In a preferred embodiment, at least 50%, 60%, 70%, 80%, 90%, 95% or more of the amino acids of a transmembrane domain are hydrophobic, e.g., leucines, isoleucines, tyrosines, or tryptophans. Transmembrane domains are described in, for example, http://pfam.wustl.edu/cgi-bin/getdesc?name=7tm-1, and Zagotta W. N. et al. (1996) Annu. Rev. Neurosci. 19:235-63, the contents of which are incorporated herein by reference.

[3590] In a preferred embodiment, 33410 polypeptide or protein has a “transmembrane domain” or a region which includes at least about 1 to 45, more preferably about 10 to 35, even more preferably about 20 to 25 or 23 amino acid residues and has at least about 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “transmembrane domain,” e.g., the transmembrane domain of human 33410 (e.g., residues 676 to 698 of SEQ ID NO:54).

[3591] A 33410 protein further includes a predicted N-terminal extracellular domain located at about amino acids 1-675 (or 15-675 of the mature protein) of SEQ ID NO:54. As used herein, an “N-terminal extracellular domain” includes an amino acid sequence about 1-800, preferably about 200-700, and even more preferably about 300-680 or 675, amino acid residues in length and is located outside of a cell or extracellularly. The C-terminal amino acid residue of an “N-terminal extracellular domain” is adjacent to an N-terminal amino acid residue of a transmembrane domain in a naturally occurring 33410 or 33410-like protein. For example, an N-terminal extracellular domain is located at about amino acid residues 1-675 of SEQ ID NO:54.

[3592] In a preferred embodiment, 33410 polypeptide or protein has an “N-terminal extracellular domain” or a region which includes at least about 1-800, preferably about 200-700, and even more preferably about 300-680 or 675 amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with an “N-terminal extracellular domain,” e.g., the N-terminal extracellular domain of human 33410 (e.g., residues 1-675 of SEQ ID NO:54). Preferably, the N-terminal extracellular domain is capable of interacting with (e.g., binding to) an extracellular signal (e.g., a neurexin) and/or modulating cell adhesion.

[3593] In another embodiment, a 33410 protein includes a “C-terminal cytoplasmic domain”, also referred to herein as a C-terminal cytoplasmic tail, in the sequence of the protein. As used herein, a “C-terminal cytoplasmic domain” includes an amino acid sequence having a length of at least about 50 to 200, preferably 100 to 150, and more preferably 136 amino acid residues and is located within a cell or within the cytoplasm of a cell. Accordingly, the N-terminal amino acid residue of a “C-terminal cytoplasmic domain” is adjacent to a C-terminal amino acid residue of a transmembrane domain in a naturally-occurring 33410 or 33410-like protein. For example, a C-terminal cytoplasmic domain is found at about amino acid residues 699-835 of SEQ ID NO:54.

[3594] In a preferred embodiment, a 33410 polypeptide or protein has a C-terminal cytoplasmic domain or a region which includes at least about 50 to 200, preferably 100 to 150, and more preferably 136 amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “C-terminal cytoplasmic domain,” e.g., the C-terminal cytoplasmic domain of human 33410 (e.g., residues 699-835 of SEQ ID NO:54).

[3595] A 33410 molecule can further include a signal sequence. As used herein, a “signal sequence” refers to a peptide of about 10-40 amino acid residues in length which occurs at the N-terminus of secretory and integral membrane proteins and which contains a majority of hydrophobic amino acid residues. For example, a signal sequence contains at least about 12-30 amino acid residues, preferably about 13-20 amino acid residues, more preferably about 14 amino acid residues, and has at least about 40-70%, preferably about 50-65%, and more preferably about 55-60% hydrophobic amino acid residues (e.g., alanine, valine, leucine, isoleucine, phenylalanine, tyrosine, tryptophan, or proline). Such a “signal sequence”, also referred to in the art as a “signal peptide”, serves to direct a protein containing such a sequence to a lipid bilayer. For example, in one embodiment, a 33410 protein contains a signal sequence of about amino acids 1-14 of SEQ ID NO:54. The “signal sequence” is cleaved during processing of the mature protein. The mature 33410 protein corresponds to amino acids 15 to 835 of SEQ ID NO:54.
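The length and hydrophobicity criteria given above for a signal sequence can be expressed as a simple check. The Python sketch below is a minimal illustration, not a signal peptide prediction method: the function names are hypothetical, the hydrophobic residue set is the one listed in the text (alanine, valine, leucine, isoleucine, phenylalanine, tyrosine, tryptophan, proline), and the example peptide is an invented placeholder rather than residues 1-14 of SEQ ID NO:54.

```python
# One-letter codes for the hydrophobic residues listed in the text:
# alanine, valine, leucine, isoleucine, phenylalanine, tyrosine,
# tryptophan, and proline.
HYDROPHOBIC = set("AVLIFYWP")

def hydrophobic_fraction(peptide: str) -> float:
    """Fraction of residues in the peptide that are in the hydrophobic set."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must not be empty")
    return sum(aa in HYDROPHOBIC for aa in peptide) / len(peptide)

def meets_signal_sequence_criteria(peptide: str,
                                   min_len: int = 10,
                                   max_len: int = 40) -> bool:
    """Apply the broad definition: about 10-40 residues, majority hydrophobic."""
    return (min_len <= len(peptide) <= max_len
            and hydrophobic_fraction(peptide) > 0.5)

if __name__ == "__main__":
    # Hypothetical 14-residue peptide used only to exercise the functions;
    # it is NOT taken from SEQ ID NO:54.
    example = "MALLWLLAVGSTAQ"
    print(round(hydrophobic_fraction(example), 3))
    print(meets_signal_sequence_criteria(example))
```

The narrower preferred ranges in the text (e.g., about 13-20 residues, or about 55-60% hydrophobic residues) could be applied by passing different length bounds or by comparing the returned fraction against those ranges.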

[3596] As the 33410 polypeptides of the invention may modulate 33410-mediated activities, they may be useful as or for developing novel diagnostic and therapeutic agents for 33410-mediated or related disorders, as described below.

[3597] As used herein, a “33410 activity”, “biological activity of 33410” or “functional activity of 33410”, refers to an activity exerted by a 33410 protein, polypeptide or nucleic acid molecule on e.g., a 33410-responsive cell or on a 33410 substrate, e.g., a protein substrate, as determined in vivo or in vitro. In one embodiment, a 33410 activity is a direct activity, such as an association with a 33410 target molecule. A “target molecule” or “binding partner” is a molecule with which a 33410 protein binds or interacts in nature, e.g., a cell surface molecule, e.g., a neurexin. A 33410 activity can also be an indirect activity, e.g., a cellular signaling activity mediated by interaction of the 33410 protein with a 33410 receptor. For example, the 33410 proteins of the present invention can have one or more of the following activities: (1) ability to catalyze the hydrolysis of carboxylic esters; (2) ability to mediate cell-cell (e.g., neuron-neuron, or neuron-glia) recognition events, adhesion or attachment; (3) ability to interact with a cell surface protein (e.g., neurexin) or an extracellular component; (4) ability to modulate cell migration; (5) ability to modulate patterning; (6) ability to modulate proliferation, and/or differentiation, of a cell (e.g., a neural or a cancer cell); (7) ability to modulate embryonic development and differentiation; (8) ability to modulate morphogenesis; (9) ability to modulate tissue maintenance; (10) ability to modulate neural development, e.g., axonal growth, synaptogenesis, neurite outgrowth, membrane excitability and/or guidance; or (11) ability to bind a divalent cation, e.g., Zn²⁺, Mg²⁺, Cd²⁺, Mn²⁺, and/or preferably a Ca²⁺ ion.

[3598] Based on the above-described sequence similarities, the 33410 molecules of the present invention are predicted to have similar biological activities as carboxylesterase family members, in particular neuroligin proteins. Thus, the 33410 molecules can act as novel diagnostic targets and therapeutic agents for controlling cell proliferative and cell differentiative disorders, as well as neural disorders (e.g., neurodegenerative disorders including CNS disorders).

[3599] The 33410 protein may be involved in disorders characterized by aberrant activity of the cells in which it is expressed. 33410 is expressed in cells and tissues derived from heart, arteries, kidney, brain (e.g., cortex and hypothalamus), spinal cord and ovaries (Table 16). Accordingly, the 33410 molecules can serve as novel diagnostic targets and therapeutic agents for controlling disorders involving the cells or tissues where they are expressed. For example, the 33410 molecules can serve as novel diagnostic targets and therapeutic agents for controlling disorders of cell proliferation, cell differentiation, angiogenesis, organogenesis, and cell signaling.

[3600] Disorders involving the brain include, but are not limited to, disorders involving neurons, and disorders involving glia, such as astrocytes, oligodendrocytes, ependymal cells, and microglia; cerebral edema, raised intracranial pressure and herniation, and hydrocephalus; malformations and developmental diseases, such as neural tube defects, forebrain anomalies, posterior fossa anomalies, and syringomyelia and hydromyelia; perinatal brain injury; cerebrovascular diseases, such as those related to hypoxia, ischemia, and infarction, including hypotension, hypoperfusion, and low-flow states—global cerebral ischemia and focal cerebral ischemia—infarction from obstruction of local blood supply, intracranial hemorrhage, including intracerebral (intraparenchymal) hemorrhage, subarachnoid hemorrhage and ruptured berry aneurysms, and vascular malformations, hypertensive cerebrovascular disease, including lacunar infarcts, slit hemorrhages, and hypertensive encephalopathy; infections, such as acute meningitis, including acute pyogenic (bacterial) meningitis and acute aseptic (viral) meningitis, acute focal suppurative infections, including brain abscess, subdural empyema, and extradural abscess, chronic bacterial meningoencephalitis, including tuberculosis and mycobacterioses, neurosyphilis, and neuroborreliosis (Lyme disease), viral meningoencephalitis, including arthropod-borne (Arbo) viral encephalitis, Herpes simplex virus Type 1, Herpes simplex virus Type 2, Varicalla-zoster virus (Herpes zoster), cytomegalovirus, poliomyelitis, rabies, and human immunodeficiency virus 1, including HIV-1 meningoencephalitis (subacute encephalitis), vacuolar myelopathy, AIDS-associated myopathy, peripheral neuropathy, and AIDS in children, progressive multifocal leukoencephalopathy, subacute sclerosing panencephalitis, fungal meningoencephalitis, other infectious diseases of the nervous system; transmissible spongiform encephalopathies (prion diseases); demyelinating diseases, 
including multiple sclerosis, multiple sclerosis variants, acute disseminated encephalomyelitis and acute necrotizing hemorrhagic encephalomyelitis, and other diseases with demyelination; degenerative diseases, such as degenerative diseases affecting the cerebral cortex, including Alzheimer disease and Pick disease, degenerative diseases of basal ganglia and brain stem, including Parkinsonism, idiopathic Parkinson disease (paralysis agitans), progressive supranuclear palsy, corticobasal degenration, multiple system atrophy, including striatonigral degenration, Shy-Drager syndrome, and olivopontocerebellar atrophy, and Huntington disease; spinocerebellar degenerations, including spinocerebellar ataxias, including Friedreich ataxia, and ataxia-telanglectasia, degenerative diseases affecting motor neurons, including amyotrophic lateral sclerosis (motor neuron disease), bulbospinal atrophy (Kennedy syndrome), and spinal muscular atrophy; inborn errors of metabolism, such as leukodystrophies, including Krabbe disease, metachromatic leukodystrophy, adrenoleukodystrophy, Pelizaeus-Merzbacher disease, and Canavan disease, mitochondrial encephalomyopathies, including Leigh disease and other mitochondrial encephalomyopathies; toxic and acquired metabolic diseases, including vitamin deficiencies such as thiamine (vitamin B₁) deficiency and vitamin B₁₂ deficiency, neurologic sequelae of metabolic disturbances, including hypoglycemia, hyperglycemia, and hepatic encephatopathy, toxic disorders, including carbon monoxide, methanol, ethanol, and radiation, including combined methotrexate and radiation-induced injury; tumors, such as gliomas, including astrocytoma, including fibrillary (diffuse) astrocytoma and glioblastoma multiforme, pilocytic astrocytoma, pleomorphic xanthoastrocytoma, and brain stem glioma, oligodendroglioma, and ependymoma and related paraventricular mass lesions, neuronal tumors, poorly differentiated neoplasms, including medulloblastoma, other parenchymal 
tumors, including primary brain lymphoma, germ cell tumors, and pineal parenchymal tumors, meningiomas, metastatic tumors, paraneoplastic syndromes, peripheral nerve sheath tumors, including schwannoma, neurofibroma, and malignant peripheral nerve sheath tumor (malignant schwannoma), and neurocutaneous syndromes (phakomatoses), including neurofibromatosis, including Type 1 neurofibromatosis (NF1) and Type 2 neurofibromatosis (NF2), tuberous sclerosis, and Von Hippel-Lindau disease.

[3601] The polypeptides and nucleic acids of the invention can also be used to treat, prevent, and/or diagnose cancers and neoplastic conditions. Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast and liver origin.

[3602] As used herein, the terms “cancer”, “hyperproliferative” and “neoplastic” refer to cells having the capacity for autonomous growth, i.e., an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative and neoplastic disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. “Pathologic hyperproliferative” cells occur in disease states characterized by malignant tumor growth. Examples of non-pathologic hyperproliferative cells include proliferation of cells associated with wound repair.

[3603] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[3604] The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

[3605] The term “sarcoma” is art recognized and refers to malignant tumors of mesenchymal derivation.

[3606] Examples of cancers or neoplastic conditions include, but are not limited to, a fibrosarcoma, myosarcoma, liposarcoma, chondrosarcoma, osteogenic sarcoma, chordoma, angiosarcoma, endotheliosarcoma, lymphangiosarcoma, lymphangioendotheliosarcoma, synovioma, mesothelioma, Ewing's tumor, leiomyosarcoma, rhabdomyosarcoma, gastric cancer, esophageal cancer, rectal cancer, pancreatic cancer, ovarian cancer, prostate cancer, uterine cancer, cancer of the head and neck, skin cancer, brain cancer, squamous cell carcinoma, sebaceous gland carcinoma, papillary carcinoma, papillary adenocarcinoma, cystadenocarcinoma, medullary carcinoma, bronchogenic carcinoma, renal cell carcinoma, hepatoma, bile duct carcinoma, choriocarcinoma, seminoma, embryonal carcinoma, Wilms' tumor, cervical cancer, testicular cancer, small cell lung carcinoma, non-small cell lung carcinoma, bladder carcinoma, epithelial carcinoma, glioma, astrocytoma, medulloblastoma, craniopharyngioma, ependymoma, pinealoma, hemangioblastoma, acoustic neuroma, oligodendroglioma, meningioma, melanoma, neuroblastoma, retinoblastoma, leukemia, lymphoma, or Kaposi sarcoma.

[3607] Examples of cellular proliferative and/or differentiative disorders of the breast include, but are not limited to, proliferative breast disease including, e.g., epithelial hyperplasia, sclerosing adenosis, and small duct papillomas; tumors, e.g., stromal tumors such as fibroadenoma, phyllodes tumor, and sarcomas, and epithelial tumors such as large duct papilloma; carcinoma of the breast including in situ (noninvasive) carcinoma that includes ductal carcinoma in situ (including Paget's disease) and lobular carcinoma in situ, and invasive (infiltrating) carcinoma including, but not limited to, invasive ductal carcinoma, invasive lobular carcinoma, medullary carcinoma, colloid (mucinous) carcinoma, tubular carcinoma, and invasive papillary carcinoma, and miscellaneous malignant neoplasms. Disorders in the male breast include, but are not limited to, gynecomastia and carcinoma.

[3608] Examples of cellular proliferative and/or differentiative disorders of the lung include, but are not limited to, bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

[3609] Examples of cellular proliferative and/or differentiative disorders of the colon include, but are not limited to, non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors. Examples of cellular proliferative and/or differentiative disorders of the liver include, but are not limited to, nodular hyperplasias, adenomas, and malignant tumors, including primary carcinoma of the liver and metastatic tumors.

[3610] Examples of cellular proliferative and/or differentiative disorders of the ovary include, but are not limited to, ovarian tumors such as tumors of coelomic epithelium, serous tumors, mucinous tumors, endometrioid tumors, clear cell adenocarcinoma, cystadenofibroma, Brenner tumor, surface epithelial tumors; germ cell tumors such as mature (benign) teratomas, monodermal teratomas, immature malignant teratomas, dysgerminoma, endodermal sinus tumor, choriocarcinoma; sex cord-stromal tumors such as granulosa-theca cell tumors, thecoma-fibromas, androblastomas, hilus cell tumors, and gonadoblastoma; and metastatic tumors such as Krukenberg tumors.

[3611] The 33410 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of hematopoietic neoplastic disorders. Additional examples of proliferative disorders include hematopoietic neoplastic disorders. As used herein, the term “hematopoietic neoplastic disorders” includes diseases involving hyperplastic/neoplastic cells of hematopoietic origin, e.g., arising from myeloid, lymphoid or erythroid lineages, or precursor cells thereof. Preferably, the diseases arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit Rev. in Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to, acute lymphoblastic leukemia (ALL), which includes B-lineage ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell leukemia (HCL) and Waldenstrom's macroglobulinemia (WM). Additional forms of malignant lymphomas include, but are not limited to, non-Hodgkin lymphoma and variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGL), Hodgkin's disease and Reed-Sternberg disease.

[3612] As the 33410 mRNA is expressed in the normal heart, artery, spinal cord, kidney, brain, and ovary, it is likely that 33410 molecules of the present invention are involved in disorders characterized by aberrant activity of these cells. Thus, the 33410 molecules can act as novel diagnostic targets and therapeutic agents for controlling disorders involving aberrant activity of these cells. For example, modulators of 33410 polypeptide or nucleic acid activity or expression can be used to treat or prevent endothelial cell disorders, and more broadly cardiovascular or blood vessel disorders.

[3613] The 33410 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO:54 are collectively referred to as “polypeptides or proteins of the invention” or “33410 polypeptides or proteins”. Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “33410 nucleic acids.” 33410 molecules refer to 33410 nucleic acids, polypeptides, and antibodies.

[3614] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA) and RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA generated, e.g., by the use of nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[3615] The term “isolated or purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules that are present in the natural source of the nucleic acid. For example, with regard to genomic DNA, the term “isolated” includes nucleic acid molecules that are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences that naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[3616] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and non-aqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified.
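The four sets of conditions recited above can be summarized as a simple lookup, which may help when recording which stringency level was used in a given experiment. The following is only an illustrative sketch: the names `Stringency`, `CONDITIONS`, and `conditions_for` are hypothetical and are not part of the disclosure, and the values are transcribed from the preceding paragraph.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stringency:
    hybridization: str   # hybridization buffer and temperature
    wash: str            # wash buffer and number of washes
    wash_temp_c: int     # wash temperature in degrees Celsius


# Values transcribed from the paragraph above; the keys are hypothetical labels.
CONDITIONS = {
    "low": Stringency("6x SSC at about 45 C",
                      "two washes in 0.2x SSC, 0.1% SDS", 50),
    "medium": Stringency("6x SSC at about 45 C",
                         "one or more washes in 0.2x SSC, 0.1% SDS", 60),
    "high": Stringency("6x SSC at about 45 C",
                       "one or more washes in 0.2x SSC, 0.1% SDS", 65),
    "very_high": Stringency("0.5 M sodium phosphate, 7% SDS at 65 C",
                            "one or more washes in 0.2x SSC, 1% SDS", 65),
}

# The text states that very high stringency applies unless otherwise specified.
DEFAULT_LEVEL = "very_high"


def conditions_for(level: Optional[str] = None) -> Stringency:
    """Return the conditions for a level, falling back to the stated default."""
    return CONDITIONS[level or DEFAULT_LEVEL]
```

Calling `conditions_for()` with no argument returns the very high stringency entry, mirroring the default stated in the text.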

[3617] Preferably, an isolated nucleic acid molecule of the invention that hybridizes under stringent conditions to the sequence of SEQ ID NO:53 or 55, corresponds to a naturally-occurring nucleic acid molecule.

[3618] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes a natural protein).

[3619] As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules which include an open reading frame encoding a 33410 protein, preferably a mammalian 33410 protein, and can further include non-coding regulatory sequences, and introns.

[3620] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. In one embodiment, the language “substantially free” means preparation of 33410 protein having less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-33410 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-33410 chemicals. When the 33410 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[3621] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 33410 (e.g., the sequence of SEQ ID NO:53 or 55, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______) without abolishing or more preferably, without substantially altering a biological activity, whereas an “essential” amino acid residue results in such a change. For example, amino acid residues that are conserved among the polypeptides of the present invention, e.g., those present in the carboxylesterase domain, are predicted to be particularly unamenable to alteration.

[3622] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a 33410 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a 33410 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 33410 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO:53 or SEQ ID NO:55, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
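By way of illustration only, the side-chain families listed above can be encoded so that a proposed substitution is checked for family membership. This is a minimal sketch using standard one-letter amino acid codes; the function names are hypothetical, and the treatment of overlapping families (e.g., threonine appears in two families) is an assumption about how the list would be applied.

```python
# Side-chain families as listed in the paragraph above (one-letter codes).
# Membership overlaps: a residue may belong to more than one family.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged_polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta_branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(original: str, replacement: str) -> list[str]:
    """Return the names of families that contain both residues."""
    a, b = original.upper(), replacement.upper()
    return [name for name, members in SIDE_CHAIN_FAMILIES.items()
            if a in members and b in members]


def is_conservative(original: str, replacement: str) -> bool:
    """True if two different residues share at least one side-chain family."""
    if original.upper() == replacement.upper():
        return False  # no substitution took place
    return bool(shared_families(original, replacement))
```

For example, `is_conservative("K", "R")` is true because both residues have basic side chains, whereas `is_conservative("D", "K")` is false.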

[3623] As used herein, a “biologically active portion” of a 33410 protein includes a fragment of a 33410 protein that participates in an interaction between a 33410 molecule and a non-33410 molecule. Biologically active portions of a 33410 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 33410 protein, e.g., the amino acid sequence shown in SEQ ID NO:54, which include fewer amino acids than the full length 33410 proteins, and exhibit at least one activity of a 33410 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 33410 protein, e.g., (1) binding to a cell surface ligand, e.g., neurexins; (2) acting as a cell surface receptor; (3) possessing cell adhesion properties; (4) mediating cell-cell interactions between, e.g., neurons; (5) regulating inter-neuronal recognition pathways for axon pathfinding; (6) regulating neuritogenesis; or (7) binding of a divalent cation, e.g., Zn²⁺, Ca²⁺, Mg²⁺, Cd²⁺ and/or Mn²⁺. A biologically active portion of a 33410 protein can be a polypeptide that is, for example, 10, 25, 50, 100, 200 or more amino acids in length. Biologically active portions of a 33410 protein can be used as targets for developing agents that modulate a 33410 mediated activity, e.g., an activity as described herein.

[3624] Particular 33410 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO:54. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:54 are termed substantially identical.

[3625] In the context of nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:53 or 55 are termed substantially identical.

[3626] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[3627] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, even more preferably at least 60%, and even more preferably at least 70%, 80%, 90%, 100% of the length of the reference sequence (e.g., when aligning a second sequence to the 33410 amino acid sequence of SEQ ID NO:54 having 560 amino acid residues, at least [30%] 168, preferably at least [40%] 224, more preferably at least [50%] 280, even more preferably at least [60%] 336, and even more preferably at least [70%] 392, [80%] 448, or [90%] 504 amino acid residues are aligned). The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”). The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
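The arithmetic in the preceding paragraph can be reproduced with a short sketch. The first function computes the minimum number of aligned residues for each recited percentage of a reference length (560 residues in the example given); the second computes percent identity from two already-aligned sequences. The function names are hypothetical, and the choice of denominator (all alignment columns, including gap columns) is an assumption, since the text states only that gaps are taken into account.

```python
def min_aligned_residues(reference_length: int,
                         percents=(30, 40, 50, 60, 70, 80, 90, 100)) -> dict[int, int]:
    """Map each percentage to the smallest residue count meeting it.

    Integer arithmetic (ceiling division) avoids floating-point rounding.
    """
    return {p: -(-reference_length * p // 100) for p in percents}


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Assumption: gap columns count toward the denominator, so each gap
    lowers the reported identity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(1 for x, y in zip(aligned_a, aligned_b)
                    if x == y and x != gap)
    return 100.0 * identical / len(aligned_a)
```

With a reference length of 560, `min_aligned_residues(560)` yields 168, 224, 280, 336, 392, 448, and 504 residues for 30% through 90%, matching the figures recited above.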

[3628] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch (J. Mol. Biol. (48):444-453 (1970)) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used if the practitioner is uncertain about what parameters should be applied to determine if a molecule is within a sequence identity or homology limitation of the invention) is a BLOSUM 62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
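For orientation, the global alignment approach of Needleman and Wunsch can be sketched as follows. This is a simplified illustration and not the GAP program: it assumes a flat match/mismatch score in place of a BLOSUM 62 or PAM250 matrix and a single linear gap penalty in place of separate gap and length weights. The function name and default scores are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -2):
    """Global alignment by dynamic programming with a linear gap penalty.

    Returns (score, aligned_a, aligned_b), with '-' marking gaps.
    """
    n, m = len(a), len(b)

    def sub(i: int, j: int) -> int:
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

In practice a substitution matrix and affine gap penalties, as recited above, would replace the flat scores used in this sketch.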

[3629] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Meyers and W. Miller (CABIOS, 4:11-17 (1989)) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[3630] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 33410 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 33410 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25(17):3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[3631] “Misexpression or aberrant expression”, as used herein, refers to a non-wild type pattern of gene expression, at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over or under expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of decreased expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[3632] “Subject,” as used herein, refers to human and non-human animals. The term “non-human animals” of the invention includes all vertebrates, e.g., mammals, such as non-human primates (particularly higher primates), sheep, dog, rodent (e.g., mouse or rat), guinea pig, goat, pig, cat, rabbits, cow, and non-mammals, such as chickens, amphibians, reptiles, etc. In a preferred embodiment, the subject is a human. In another embodiment, the subject is an experimental animal or animal suitable as a disease model.

[3633] A “purified preparation of cells”, as used herein, refers to, in the case of plant or animal cells, an in vitro preparation of cells and not an entire intact plant or animal. In the case of cultured cells or microbial cells, it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[3634] Various aspects of the invention are described in further detail below.

[3635] Isolated Nucleic Acid Molecules of 33410

[3636] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes a 33410 polypeptide described herein, e.g., a full-length 33410 protein or a fragment thereof, e.g., a biologically active portion of 33410 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 33410 mRNA, and fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[3637] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO:53, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 33410 protein (i.e., “the coding region” of SEQ ID NO:53, as shown in SEQ ID NO:55), as well as 5′ untranslated sequences. Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:53 (e.g., SEQ ID NO:55) and, e.g., no flanking sequences which normally accompany the subject sequence. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to a fragment of the protein from about amino acid 42 to 601 of SEQ ID NO:54.

[3638] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule which is a complement of the nucleotide sequence shown in SEQ ID NO:53 or SEQ ID NO:55, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO:53 or SEQ ID NO:55, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ such that it can hybridize to the nucleotide sequence shown in SEQ ID NO:53 or 55, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, thereby forming a stable duplex.

[3639] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence which is at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO:53 or SEQ ID NO:55, or the entire length of the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, or a portion, preferably of the same length, of any of these nucleotide sequences.

[3640] 33410 Nucleic Acid Fragments

[3641] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO:53 or 55, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______. For example, such a nucleic acid molecule can include a fragment that can be used as a probe or primer or a fragment encoding a portion of a 33410 protein, e.g., an immunogenic or biologically active portion of a 33410 protein. A fragment can comprise those nucleotides of SEQ ID NO:53 which encode a carboxylesterase domain of human 33410. The nucleotide sequence determined from the cloning of the 33410 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 33410 family members, or fragments thereof, as well as 33410 homologues, or fragments thereof, from other species.

[3642] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment that includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 800 amino acids in length. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention.

[3643] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domains, regions, or functional sites described herein. Thus, for example, a 33410 nucleic acid fragment can include a sequence corresponding to a carboxylesterase domain.

[3644] In a preferred embodiment, the fragment is at least 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, or 1000 nucleotides in length.

[3645] 33410 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under stringent conditions to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO:53 or SEQ ID NO:55, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, or of a naturally occurring allelic variant or mutant of SEQ ID NO:53 or SEQ ID NO:55, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______.

[3646] In a preferred embodiment the nucleic acid is a probe which is at least 5 or 10, and less than 200, more preferably less than 100, or less than 50, base pairs in length. It should be identical to, or differ by 1, or by fewer than 5 or 10 bases from, a sequence disclosed herein. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[3647] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid that encodes a carboxylesterase domain from about amino acid 42 to 601 of SEQ ID NO:54, a carboxylesterase type-B signature 2 domain from about amino acid 139 to 149 of SEQ ID NO:54, or a transmembrane domain from about amino acid 676 to 698 of SEQ ID NO:54.

[3648] In another embodiment a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 33410 sequence, e.g., a domain, region, site or other sequence described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical to, or differ by one base from, a sequence disclosed herein or from a naturally occurring variant. For example, primers suitable for amplifying all or a portion of any of the following regions are provided: a carboxylesterase domain from about amino acid 42 to 601 of SEQ ID NO:54; a carboxylesterase type-B signature 2 domain from about amino acid 139 to 149 of SEQ ID NO:54; or a transmembrane domain from about amino acid 676 to 698 of SEQ ID NO:54.
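As an illustration of how the amino acid boundaries above relate to primer placement, the following minimal sketch converts an amino acid range into the nucleotide range of the codons that encode it. It assumes, hypothetically, that nucleotide 1 is the first base of the start codon of an intron-free coding sequence; the function name and the numbering convention are illustrative and are not taken from the sequence listing.

```python
from typing import Tuple


def aa_range_to_nt_range(aa_start: int, aa_end: int) -> Tuple[int, int]:
    """Map a 1-based, inclusive amino acid range to the 1-based, inclusive
    nucleotide range of the codons encoding it.

    Assumption: nucleotide 1 is the first base of the start codon.
    """
    if aa_start < 1 or aa_end < aa_start:
        raise ValueError("expected 1 <= aa_start <= aa_end")
    nt_start = 3 * (aa_start - 1) + 1  # first base of the first codon
    nt_end = 3 * aa_end                # last base of the last codon
    return nt_start, nt_end


# Approximate domain boundaries quoted in the text (amino acids of SEQ ID NO:54).
domains = {
    "carboxylesterase domain": (42, 601),
    "carboxylesterase type-B signature 2 domain": (139, 149),
    "transmembrane domain": (676, 698),
}
for name, (start, end) in domains.items():
    print(name, aa_range_to_nt_range(start, end))
```

Under this numbering assumption, a primer pair amplifying a whole domain would flank, or begin and end at, the reported nucleotide range. If the coding region starts at a different position in a longer cDNA, the offset of the start codon would have to be added to both values.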

[3649] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[3650] A nucleic acid fragment encoding a “biologically active portion of a 33410 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:53 or 55, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, which encodes a polypeptide having a 33410 biological activity (e.g., the biological activities of the 33410 proteins are described herein), expressing the encoded portion of the 33410 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 33410 protein. For example, a nucleic acid fragment encoding a biologically active portion of 33410 includes a carboxylesterase domain, e.g., amino acid residues about 42 to 601 of SEQ ID NO:54. A nucleic acid fragment encoding a biologically active portion of a 33410 polypeptide may comprise a nucleotide sequence which is 300 or more nucleotides in length.

[3651] In a preferred embodiment, a nucleic acid fragment differs by at least 1, 2, 3, 10, 20, or more nucleotides from the sequence of Genbank accession number AW291374. Differ can include differing in length or sequence identity. E.g., a nucleic acid fragment can: include one or more nucleotides from SEQ ID NO:53 or SEQ ID NO:55 outside the region of nucleotides 927-1424; not include all of the nucleotides of AW291374, e.g., can be one or more nucleotides shorter (at one or both ends) than the sequence of AW291374; or can differ by one or more nucleotides in the region of overlap.

[3652] In a preferred embodiment, a nucleic acid fragment differs by at least 1, 2, 3, 10, 20, or more nucleotides from the sequence of Genbank accession number AI337820. Differ can include differing in length or sequence identity. E.g., a nucleic acid fragment can: include one or more nucleotides from SEQ ID NO:53 or SEQ ID NO:55 outside the region of nucleotides 804-1331; not include all of the nucleotides of AI337820, e.g., can be one or more nucleotides shorter (at one or both ends) than the sequence of AI337820; or can differ by one or more nucleotides in the region of overlap.

[3653] In a preferred embodiment, a nucleic acid fragment differs by at least 1, 2, 3, 10, 20, or more nucleotides from the sequence of Genbank accession number AB037787. Differ can include differing in length or sequence identity. E.g., a nucleic acid fragment can: include one or more nucleotides from SEQ ID NO:53 or SEQ ID NO:55 outside the region of nucleotides 1273-4641; not include all of the nucleotides of AB037787, e.g., can be one or more nucleotides shorter (at one or both ends) than the sequence of AB037787; or can differ by one or more nucleotides in the region of overlap.

[3654] In a preferred embodiment, a nucleic acid fragment differs by at least 1, 2, 3, 10, 20, or more nucleotides from the sequence of Genbank accession number C74943. Differ can include differing in length or sequence identity. E.g., a nucleic acid fragment can: include one or more nucleotides from SEQ ID NO:53 or SEQ ID NO:55 outside the region of nucleotides 1422-2345; not include all of the nucleotides of C74943, e.g., can be one or more nucleotides shorter (at one or both ends) than the sequence of C74943; or can differ by one or more nucleotides in the region of overlap.

[3655] In a preferred embodiment, a nucleic acid fragment differs by at least 1, 2, 3, 10, 20, or more nucleotides from the sequence of Genbank accession number V59639. Differ can include differing in length or sequence identity. E.g., a nucleic acid fragment can: include one or more nucleotides from SEQ ID NO:53 or SEQ ID NO:55 outside the region of nucleotides 3758-4297; not include all of the nucleotides of V59639, e.g., can be one or more nucleotides shorter (at one or both ends) than the sequence of V59639; or can differ by one or more nucleotides in the region of overlap.

[3656] In a preferred embodiment, a nucleic acid fragment differs by at least 1, 2, 3, 10, 20, or more nucleotides from the sequence of Genbank accession number AAA97870. Differ can include differing in length or sequence identity. E.g., a nucleic acid fragment can: include one or more nucleotides from SEQ ID NO:53 or SEQ ID NO:55 outside the region of nucleotides 420-2372; not include all of the nucleotides of AAA97870, e.g., can be one or more nucleotides shorter (at one or both ends) than the sequence of AAA97870; or can differ by one or more nucleotides in the region of overlap.

[3657] In a preferred embodiment, a nucleic acid fragment differs by at least 1, 2, 3, 10, 20, or more nucleotides from the sequence of Genbank accession number BAA92604. Differ can include differing in length or sequence identity. E.g., a nucleic acid fragment can: include one or more nucleotides from SEQ ID NO:53 or SEQ ID NO:55 outside the region of nucleotides 1275-2924; not include all of the nucleotides of BAA92604, e.g., can be one or more nucleotides shorter (at one or both ends) than the sequence of BAA92604; or can differ by one or more nucleotides in the region of overlap.

[3658] In a preferred embodiment, a nucleic acid fragment differs by at least 1, 2, 3, 10, 20, or more nucleotides from a sequence in WO 01/27277.7. Differ can include differing in length or sequence identity. E.g., a nucleic acid fragment can: include one or more nucleotides from SEQ ID NO:53 or SEQ ID NO:55 outside the region of nucleotides 420-2320; not include all of the nucleotides of a sequence in WO 01/27277.7, e.g., can be one or more nucleotides shorter (at one or both ends) than a sequence in WO 01/27277.7; or can differ by one or more nucleotides in the region of overlap.

[3659] In a preferred embodiment, a nucleic acid fragment differs by at least 1, 2, 3, 10, 20, or more nucleotides from a sequence in WO 01/27277.8. Differ can include differing in length or sequence identity. E.g., a nucleic acid fragment can: include one or more nucleotides from SEQ ID NO:53 or SEQ ID NO:55 outside the region of nucleotides 420-2372; not include all of the nucleotides of a sequence in WO 01/27277.8, e.g., can be one or more nucleotides shorter (at one or both ends) than a sequence in WO 01/27277.8; or can differ by one or more nucleotides in the region of overlap.
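One of the ways of differing recited in the preceding paragraphs is that a fragment includes nucleotides lying outside a stated region of overlap (for example, nucleotides 927-1424 for AW291374). A minimal sketch of that interval check follows. It assumes 1-based, inclusive coordinates on the same reference numbering for both the fragment and the region; the function name is hypothetical.

```python
def nucleotides_outside_region(frag_start: int, frag_end: int,
                               region_start: int, region_end: int) -> int:
    """Count positions of a fragment (1-based, inclusive) that fall outside
    a given region (1-based, inclusive) on the same coordinate system."""
    if frag_start > frag_end or region_start > region_end:
        raise ValueError("start must not exceed end")
    frag_len = frag_end - frag_start + 1
    overlap_start = max(frag_start, region_start)
    overlap_end = min(frag_end, region_end)
    overlap_len = max(0, overlap_end - overlap_start + 1)
    return frag_len - overlap_len


# A hypothetical fragment spanning positions 900-1000, compared with the
# region 927-1424 quoted in the text for AW291374.
print(nucleotides_outside_region(900, 1000, 927, 1424))
```

A result of one or more means the fragment includes at least one nucleotide outside the region. A result of zero means the fragment lies wholly within it, so any difference would have to come from length or from mismatches in the region of overlap.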

[3660] In a preferred embodiment, a nucleic acid fragment includes a nucleotide sequence which is about 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700, 2800, 2900, 3000, 3100, 3200, 3300, 3400, 3500, 3600, 3700, 3800, 3900, 4000 or more nucleotides in length and hybridizes under stringent hybridization conditions to a nucleic acid molecule of SEQ ID NO:53, or SEQ ID NO:55, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______.

[3661] 33410 Nucleic Acid Variants

[3662] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:53 or SEQ ID NO:55, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid that encodes the same 33410 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence which differs by at least 1, but less than 5, 10, 20, 50, or 100 amino acid residues from that shown in SEQ ID NO:54. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[3663] Nucleic acids of the invention can be chosen for having codons which are preferred, or non-preferred, for a particular expression system. E.g., the nucleic acid can be one in which at least one codon, and preferably at least 10%, or 20% of the codons have been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells.
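To make the "at least 10%, or 20% of the codons" criterion concrete, the sketch below computes the fraction of codons that differ between an original and an altered coding sequence. The sequences are toy examples, not portions of any sequence disclosed herein, and the function name is hypothetical. The sketch only counts altered codons; it does not check that the changes are synonymous or that they suit any particular host.

```python
def fraction_codons_altered(original_cds: str, altered_cds: str) -> float:
    """Return the fraction of codon positions at which two equal-length,
    in-frame coding sequences differ."""
    if len(original_cds) != len(altered_cds):
        raise ValueError("sequences must have the same length")
    if len(original_cds) == 0 or len(original_cds) % 3 != 0:
        raise ValueError("length must be a non-zero multiple of 3")
    original = original_cds.upper()
    altered = altered_cds.upper()
    n_codons = len(original) // 3
    changed = sum(
        original[3 * i:3 * i + 3] != altered[3 * i:3 * i + 3]
        for i in range(n_codons)
    )
    return changed / n_codons


# Toy example: CTT -> CTG (both Leu) and GGT -> GGC (both Gly).
print(fraction_codons_altered("ATGCTTAAAGGT", "ATGCTGAAAGGC"))
```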

[3664] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism) or can be non-naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared in the encoded product).

[3665] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO:53 or 55, or the sequence in ATCC Accession Number ______, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; or by at least one but less than 2%, 5%, 10% or 20% of the nucleotides in the subject nucleic acid. If necessary for this analysis the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.
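A rough way to count differences in the sense used above, where each mismatch and each inserted or deleted ("looped out") base counts as one difference, is the edit distance between the two sequences. The sketch below is offered under that assumption only: the text does not name an alignment algorithm, and edit distance is used here as a stand-in for aligning "for maximum homology". The function names are hypothetical.

```python
def count_differences(a: str, b: str) -> int:
    """Levenshtein edit distance: the minimum number of substitutions,
    insertions, and deletions needed to turn sequence a into sequence b."""
    prev = list(range(len(b) + 1))
    for i, base_a in enumerate(a, start=1):
        curr = [i]
        for j, base_b in enumerate(b, start=1):
            cost = 0 if base_a == base_b else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # match or mismatch
        prev = curr
    return prev[-1]


def percent_different(subject: str, reference: str) -> float:
    """Differences expressed as a percentage of the subject's length."""
    if not subject:
        raise ValueError("subject sequence must not be empty")
    return 100.0 * count_differences(subject, reference) / len(subject)


print(count_differences("ACGTACGT", "ACGTTCGT"))  # one mismatch
print(count_differences("ACGTACGT", "ACGACGT"))   # one deleted base
print(percent_different("ACGTACGT", "ACGTTCGT"))
```

The result can then be compared against the stated thresholds, for example fewer than 10 nucleotides or less than 5% of the subject nucleic acid.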

[3666] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the nucleotide sequence shown in SEQ ID NO:54 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under stringent conditions, to the nucleotide sequence shown in SEQ ID NO:54 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 33410 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 33410 gene.

[3667] Preferred variants include those that are correlated with modulating (stimulating and/or enhancing or inhibiting) cell adhesion, cellular proliferation, differentiation, or tumorigenesis; modulating neural cell activity; binding to neurexins on neurons; acting as a cell surface receptor; mediating cell-cell interactions between neurons; regulating inter-neuronal recognition pathways for axon pathfinding; regulating neuritogenesis; or binding divalent cations, e.g., Zn²⁺, Ca²⁺, Mg²⁺, Cd²⁺ and/or Mn²⁺.

[3668] Allelic variants of 33410, e.g., human 33410, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 33410 protein within a population that maintain the ability to bind an extracellular component or cell surface protein, e.g., neurexin. Functional allelic variants will typically contain only conservative substitution of one or more amino acids of SEQ ID NO:54, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally-occurring amino acid sequence variants of the 33410, e.g., human 33410, protein within a population that do not have the ability to bind to neurexins; act as a cell surface receptor; possess cell adhesion properties; mediate cell-cell interactions between neurons; regulate inter-neuronal recognition pathways for axon pathfinding; regulate neuritogenesis; or bind Zn²⁺, Ca²⁺, Mg²⁺, Cd²⁺ and/or Mn²⁺. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO:54, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[3669] Moreover, nucleic acid molecules encoding other 33410 family members and, thus, which have a nucleotide sequence which differs from the 33410 sequences of SEQ ID NO:53 or SEQ ID NO:55, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ are intended to be within the scope of the invention.

[3670] Antisense Nucleic Acid Molecules, Ribozymes and Modified 33410 Nucleic Acid Molecules

[3671] In another aspect, the invention features an isolated nucleic acid molecule that is antisense to 33410. An “antisense” nucleic acid can include a nucleotide sequence that is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 33410 coding strand, or to only a portion thereof (e.g., the coding region of human 33410 corresponding to SEQ ID NO:55). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 33410 (e.g., the 5′ and 3′ untranslated regions).

[3672] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 33410 mRNA, but more preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding region of 33410 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 33410 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence of interest. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
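The design of an oligonucleotide complementary to the region surrounding the translation start site can be sketched as a reverse-complement operation on a window of the mRNA. The example below assumes that position +1 is the first base of the start codon and that there is no position 0, so the −10 to +10 window covers 20 nucleotides. The input sequence is a toy example, not a 33410 sequence, and the function name is hypothetical.

```python
# Complement table; U in an RNA target pairs with A in the DNA oligonucleotide.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def antisense_oligo(mrna: str, start_codon_index: int,
                    upstream: int = 10, downstream: int = 10) -> str:
    """Return a DNA oligonucleotide (written 5' to 3') complementary to the
    window from -upstream to +downstream around the start codon.

    start_codon_index is the 0-based index of the first base of the start
    codon (position +1 in the convention assumed here).
    """
    lo = start_codon_index - upstream
    hi = start_codon_index + downstream
    if lo < 0 or hi > len(mrna):
        raise ValueError("window extends beyond the sequence")
    target = mrna[lo:hi]
    # Complement each base, then reverse so the result reads 5' to 3'.
    return target.translate(_COMPLEMENT)[::-1]


toy_mrna = "GGCACCGCCGCCACCATGGCTTCAGGT"  # hypothetical; ATG begins at index 15
oligo = antisense_oligo(toy_mrna, toy_mrna.index("ATG"))
print(oligo, len(oligo))
```

Other lengths recited above (for example about 15, 25, or 30 nucleotides) correspond to different values of the hypothetical `upstream` and `downstream` parameters.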

[3673] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[3674] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a 33410 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

[3675] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[3676] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for a 33410-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of a 33410 cDNA disclosed herein (i.e., SEQ ID NO:53 or SEQ ID NO:55), and a sequence having a known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haseloff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a 33410-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 33410 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science 261:1411-1418.

[3677] 33410 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 33410 gene (e.g., the 33410 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 33410 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6(6):569-84; Helene, C. et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) BioEssays 14(12):807-15. Potential sequences that can be targeted for triple helix formation can be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.

[3678] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[3679] A 33410 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4 (1): 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra; Perry-O'Keefe et al. (1996) Proc. Natl. Acad. Sci. 93: 14670-675.

[3680] PNAs of 33410 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 33410 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene, (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes, (e.g., S1 nucleases (Hyrup B. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[3681] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[3682] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region which is complementary to a 33410 nucleic acid of the invention, two complementary regions one having a fluorophore and one a quencher such that the molecular beacon is useful for quantitating the presence of the 33410 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.

[3683] Isolated 33410 Polypeptides

[3684] In another aspect, the invention features an isolated 33410 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-33410 antibodies. 33410 protein can be isolated from cells or tissue sources using standard protein purification techniques. 33410 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[3685] Polypeptides of the invention include those that arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems that result in the alteration or omission of post-translational modifications present when expressed in a native cell, e.g., glycosylation or cleavage.

[3686] In a preferred embodiment, a 33410 polypeptide has one or more of the following characteristics:

[3687] (i) it acts on carboxylic esters, e.g., acetylcholinesterase;

[3688] (ii) it modulates cell-cell (e.g., neuron-neuron, or neuron-glia) recognition event or adhesion;

[3689] (iii) it binds an extracellular component or cell surface protein, e.g., neurexin;

[3690] (iv) it modulates neural developmental events, for example, membrane excitability, neurite outgrowth, and/or synaptogenesis;

[3691] (v) it has an amino acid composition of SEQ ID NO:54;

[3692] (vi) it has an overall sequence similarity of at least 60%, preferably at least 70, more preferably at least 80, 90, or 95%, with a polypeptide of SEQ ID NO:54;

[3693] (vii) it can be found in human tissue;

[3694] (viii) it can be found in neural tissues, e.g., neurons;

[3695] (ix) it has a carboxylesterase domain which is preferably about 70%, 80%, 90% or 95% identical with amino acid residues from about 42 to 601 of SEQ ID NO:54;

[3696] (x) it has a carboxylesterase type-B signature 2 domain which is preferably about 70%, 80%, 90% or 95% identical with amino acid residues from about 139 to 149 of SEQ ID NO:54;

[3697] (xi) it has a transmembrane domain which is preferably about 70%, 80%, 90% or 95% identical with amino acid residues from about 676 to 698 of SEQ ID NO:54; or

[3698] (xii) it has at least 1, preferably 5, and most preferably 6 of the cysteines found in the amino acid sequence of the native protein (i.e., SEQ ID NO:54).

[3699] In a preferred embodiment, the 33410 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:54. In one embodiment, it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another, it differs from the corresponding sequence in SEQ ID NO:54 by at least one residue but by less than 20%, 15%, 10% or 5% of its residues. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are preferably differences or changes at a non-essential residue or a conservative substitution. In a preferred embodiment, the differences are not in the carboxylesterase domain (e.g., amino acids 42 to 601 of SEQ ID NO:54). In another preferred embodiment, one or more differences are in the carboxylesterase domain (e.g., amino acids 42 to 601 of SEQ ID NO:54).

[3700] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 33410 proteins differ in amino acid sequence from SEQ ID NO:54, yet retain biological activity.

[3701] In one embodiment, the protein includes an amino acid sequence at least about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more homologous to SEQ ID NO:54.

[3702] A 33410 protein or fragment is provided which varies from the sequence of SEQ ID NO:54 in regions defined by amino acids about 42 to 601 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment but which does not differ from SEQ ID NO:54 in regions defined by amino acids about 42 to 601. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.

[3703] In one embodiment, a biologically active portion of a 33410 protein includes a carboxylesterase domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 33410 protein.

[3704] In a preferred embodiment, the 33410 protein has an amino acid sequence shown in SEQ ID NO:54. In other embodiments, the 33410 protein is substantially identical to SEQ ID NO:54. In yet another embodiment, the 33410 protein is substantially identical to SEQ ID NO:54 and retains the functional activity of the protein of SEQ ID NO:54, as described in detail in the subsections above. Accordingly, in another embodiment, the 33410 protein is a protein which includes an amino acid sequence at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more identical to SEQ ID NO:54.
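The percent identity thresholds above can be illustrated with a simple calculation on two sequences that have already been aligned. The sketch below assumes the alignment is supplied as two equal-length strings with "-" marking gaps, and it uses the full alignment length as the denominator. Both conventions are assumptions made for illustration; other definitions of percent identity use a different denominator. The sequences shown are toy examples.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same
    residue. Columns containing a gap never count as identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignment: 5 identical columns out of 7.
print(round(percent_identity("MKT-LLA", "MKTALLG"), 1))
```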

[3705] In a preferred embodiment, a fragment differs by at least 1, 2, 3, 10, 20, or more amino acid residues from a sequence in WO 01/27277.7. Differ can include differing in length or sequence identity. E.g., a fragment can: include one or more amino acid residues from SEQ ID NO:54 outside the region encoded by nucleotides 420-2320; not include all of the amino acid residues of a sequence in WO 01/27277.7, e.g., can be one or more amino acid residues shorter (at one or both ends) than a sequence in WO 01/27277.7; or can differ by one or more amino acid residues in the region of overlap.

[3706] In a preferred embodiment, a fragment differs by at least 1, 2, 3, 10, 20, or more amino acid residues from a sequence in WO 01/27277.8. Differ can include differing in length or sequence identity. E.g., a fragment can: include one or more amino acid residues from SEQ ID NO:54 outside the region encoded by nucleotides 420-2372; not include all of the amino acid residues of a sequence in WO 01/27277.8, e.g., can be one or more amino acid residues shorter (at one or both ends) than a sequence in WO 01/27277.8; or can differ by one or more amino acid residues in the region of overlap.

[3707] 33410 Chimeric or Fusion Proteins

[3708] In another aspect, the invention provides 33410 chimeric or fusion proteins. As used herein, a 33410 “chimeric protein” or “fusion protein” includes a 33410 polypeptide linked to a non-33410 polypeptide. A “non-33410 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the 33410 protein, e.g., a protein which is different from the 33410 protein and which is derived from the same or a different organism. The 33410 polypeptide of the fusion protein can correspond to all or a portion, e.g., a fragment described herein, of a 33410 amino acid sequence. In a preferred embodiment, a 33410 fusion protein includes at least one (or two) biologically active portion of a 33410 protein. The non-33410 polypeptide can be fused to the N-terminus or C-terminus of the 33410 polypeptide.

[3709] The fusion protein can include a moiety that has a high affinity for a ligand. For example, the fusion protein can be a GST-33410 fusion protein in which the 33410 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 33410. Alternatively, the fusion protein can be a 33410 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 33410 can be increased through use of a heterologous signal sequence.

[3710] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[3711] The 33410 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 33410 fusion proteins can be used to affect the bioavailability of a 33410 substrate. 33410 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example, (i) aberrant modification or mutation of a gene encoding a 33410 protein; (ii) mis-regulation of the 33410 gene; and (iii) aberrant post-translational modification of a 33410 protein.

[3712] Moreover, the 33410-fusion proteins of the invention can be used as immunogens to produce anti-33410 antibodies in a subject, to purify 33410 ligands and in screening assays to identify molecules that inhibit the interaction of 33410 with a 33410 substrate.

[3713] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 33410-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 33410 protein.
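Whether an insert is linked in-frame to an upstream fusion moiety can be checked with simple arithmetic on the sequence lengths. The sketch below is a toy illustration: the sequences are placeholders rather than real GST, linker, or 33410 sequences, the function name is hypothetical, and only the three standard stop codons are considered. It assumes the fusion moiety sequence begins at its start codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_in_frame_fusion(tag_cds: str, linker: str, insert_cds: str) -> bool:
    """Return True if the insert continues the reading frame of the tag and
    no stop codon occurs before the final codon of the fused sequence."""
    upstream = (tag_cds + linker).upper()
    if len(upstream) % 3 != 0:
        return False  # the linker shifts the insert out of frame
    fused = upstream + insert_cds.upper()
    if len(fused) % 3 != 0:
        return False  # the insert does not end on a codon boundary
    codons = [fused[i:i + 3] for i in range(0, len(fused), 3)]
    # A premature stop codon would truncate the fusion protein.
    return not any(codon in STOP_CODONS for codon in codons[:-1])


# Placeholder sequences: a 6-nt "tag", a 6-nt linker, and a 9-nt insert.
print(is_in_frame_fusion("ATGGCC", "GGATCC", "ATGAAATAA"))
print(is_in_frame_fusion("ATGGCC", "GGATC", "ATGAAATAA"))
```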

[3714] Variants of 33410 Proteins

[3715] In another aspect, the invention also features a variant of a 33410 polypeptide, e.g., which functions as an agonist (mimetics) or as an antagonist. Variants of the 33410 proteins can be generated by mutagenesis, e.g., discrete point mutation, the insertion or deletion of sequences or the truncation of a 33410 protein. An agonist of the 33410 proteins can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of a 33410 protein. An antagonist of a 33410 protein can inhibit one or more of the activities of the naturally occurring form of the 33410 protein by, for example, competitively modulating a 33410-mediated activity of a 33410 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 33410 protein.

[3716] Variants of a 33410 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a 33410 protein for agonist or antagonist activity.

[3717] Libraries of fragments, e.g., N-terminal, C-terminal, or internal fragments, of a 33410 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of a 33410 protein.
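As a purely illustrative in silico sketch of the fragment classes named above, the following Python function enumerates N-terminal, C-terminal, and internal fragments of a protein sequence. The function name `enumerate_fragments`, the `min_len` cutoff, and the toy sequence are assumptions made for this example only and are not taken from the disclosure; an actual library would be prepared by molecular biology techniques.

```python
def enumerate_fragments(seq, min_len=8):
    """Return N-terminal, C-terminal, and internal fragments of seq.

    Hypothetical helper: min_len is an assumed cutoff chosen for the
    example, not a value prescribed by the text.
    """
    n = len(seq)
    fragments = {"n_terminal": [], "c_terminal": [], "internal": []}
    for start in range(n):
        for end in range(start + min_len, n + 1):
            if start == 0 and end == n:
                continue  # skip the full-length sequence itself
            frag = seq[start:end]
            if start == 0:
                fragments["n_terminal"].append(frag)   # keeps the N-terminus
            elif end == n:
                fragments["c_terminal"].append(frag)   # keeps the C-terminus
            else:
                fragments["internal"].append(frag)     # lacks both termini
    return fragments


toy = "MKTAYIAKQRQISFVK"  # placeholder sequence, NOT SEQ ID NO:54
library = enumerate_fragments(toy, min_len=8)
```

Each fragment in `library` could then, in principle, be scored in whichever screening assay is chosen; the enumeration itself says nothing about activity.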

[3718] Variants in which a cysteine residue is added or deleted or in which a residue that is glycosylated is added or deleted are particularly preferred.

[3719] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property, are known in the art. Recursive ensemble mutagenesis (REM), a technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify 33410 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6(3):327-331).

[3720] Cell-based assays can be exploited to analyze a variegated 33410 library. For example, a library of expression vectors can be transfected into a cell line, e.g., one that ordinarily responds to 33410 in a substrate-dependent manner. The transfected cells are then contacted with 33410 and the effect of the expression of the mutant on signaling by the 33410 substrate can be detected, e.g., by measuring the amount of binding to neurexins on neurons; the ability to act as a cell surface receptor; the cell adhesion properties; the ability to mediate cell-cell interactions between neurons; the ability to regulate inter-neuronal recognition pathways for axon pathfinding; the ability to regulate neuritogenesis; or the ability to bind Zn²⁺, Ca²⁺, Mg²⁺, Cd²⁺ and/or Mn²⁺. Plasmid DNA can then be recovered from those cells that score for inhibition, or alternatively, potentiation of signaling by the 33410 substrate, and the individual clones further characterized.

[3721] In another aspect, the invention features a method of making a 33410 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 33410 polypeptide. The method includes: altering the sequence of a 33410 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[3722] In another aspect, the invention features a method of making a fragment or analog of a 33410 polypeptide having a biological activity of a naturally occurring 33410 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of a 33410 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[3723] Anti-33410 Antibodies

[3724] In another aspect, the invention provides an anti-33410 antibody, or a fragment thereof (e.g., an antigen-binding fragment thereof). The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. As used herein, the term “antibody” refers to a protein comprising at least one, and preferably two, heavy (H) chain variable regions (abbreviated herein as VH), and at least one and preferably two light (L) chain variable regions (abbreviated herein as VL). The VH and VL regions can be further subdivided into regions of hypervariability, termed “complementarity determining regions” (“CDR”), interspersed with regions that are more conserved, termed “framework regions” (FR). The extent of the framework region and CDR's has been precisely defined (see, Kabat, E. A., et al. (1991) Sequences of Proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, and Chothia, C. et al. (1987) J. Mol. Biol. 196:901-917, which are incorporated herein by reference). Each VH and VL is composed of three CDR's and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.

[3725] The anti-33410 antibody can further include a heavy and light chain constant region, to thereby form a heavy and light immunoglobulin chain, respectively. In one embodiment, the antibody is a tetramer of two heavy immunoglobulin chains and two light immunoglobulin chains, wherein the heavy and light immunoglobulin chains are inter-connected by, e.g., disulfide bonds. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. The light chain constant region is comprised of one domain, CL. The variable region of the heavy and light chains contains a binding domain that interacts with an antigen. The constant regions of the antibodies typically mediate the binding of the antibody to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system.

[3726] As used herein, the term “immunoglobulin” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes. The recognized human immunoglobulin genes include the kappa, lambda, alpha (IgA1 and IgA2), gamma (IgG1, IgG2, IgG3, IgG4), delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Full-length immunoglobulin “light chains” (about 25 kD or 214 amino acids) are encoded by a variable region gene at the NH2-terminus (about 110 amino acids) and a kappa or lambda constant region gene at the COOH-terminus. Full-length immunoglobulin “heavy chains” (about 50 kD or 446 amino acids) are similarly encoded by a variable region gene (about 116 amino acids) and one of the other aforementioned constant region genes, e.g., gamma (encoding about 330 amino acids).

[3727] The term “antigen-binding fragment” of an antibody (or simply “antibody portion,” or “fragment”), as used herein, refers to one or more fragments of a full-length antibody that retain the ability to specifically bind to the antigen, e.g., 33410 polypeptide or fragment thereof. Examples of antigen-binding fragments of the anti-33410 antibody include, but are not limited to: (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab′)₂ fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 341:544-546), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) Science 242:423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879-5883). Such single chain antibodies are also encompassed within the term “antigen-binding fragment” of an antibody. These antibody fragments are obtained using conventional techniques known to those with skill in the art, and the fragments are screened for utility in the same manner as are intact antibodies.

[3728] The anti-33410 antibody can be a polyclonal or a monoclonal antibody. In other embodiments, the antibody can be recombinantly produced, e.g., produced by phage display or by combinatorial methods.

[3729] Phage display and combinatorial methods for generating anti-33410 antibodies are known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. International Publication No. WO 92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. International Publication No. WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and Barbas et al. (1991) PNAS 88:7978-7982, the contents of all of which are incorporated by reference herein).

[3730] In one embodiment, the anti-33410 antibody is a fully human antibody (e.g., an antibody made in a mouse which has been genetically engineered to produce an antibody from a human immunoglobulin sequence), or a non-human antibody, e.g., a rodent (mouse or rat), goat, primate (e.g., monkey), or camel antibody. Preferably, the non-human antibody is a rodent (mouse or rat) antibody. Methods of producing rodent antibodies are known in the art.

[3731] Human monoclonal antibodies can be generated using transgenic mice carrying the human immunoglobulin genes rather than the mouse system. Splenocytes from these transgenic mice immunized with the antigen of interest are used to produce hybridomas that secrete human mAbs with specific affinities for epitopes from a human protein (see, e.g., Wood et al. International Application WO 91/00906; Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International Application WO 92/03918; Kay et al. International Application WO 92/03917; Lonberg, N. et al. 1994 Nature 368:856-859; Green, L. L. et al. 1994 Nature Genet. 7:13-21; Morrison, S. L. et al. 1984 Proc. Natl. Acad. Sci. USA 81:6851-6855; Bruggemann et al. 1993 Year Immunol 7:33-40; Tuaillon et al. 1993 PNAS 90:3720-3724; Bruggemann et al. 1991 Eur J Immunol 21:1323-1326).

[3732] An anti-33410 antibody can be one in which the variable region, or a portion thereof, e.g., the CDR's, are generated in a non-human organism, e.g., a rat or mouse. Chimeric, CDR-grafted, and humanized antibodies are within the invention. Antibodies generated in a non-human organism, e.g., a rat or mouse, and then modified, e.g., in the variable framework or constant region, to decrease antigenicity in a human are within the invention.

[3733] Chimeric antibodies can be produced by recombinant DNA techniques known in the art. For example, a gene encoding the Fc constant region of a murine (or other species) monoclonal antibody molecule is digested with restriction enzymes to remove the region encoding the murine Fc, and the equivalent portion of a gene encoding a human Fc constant region is substituted (see Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., International Application WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al., European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al. (1987) PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al. (1985) Nature 314:446-449; and Shaw et al., 1988, J Natl Cancer Inst. 80:1553-1559).

[3734] A humanized or CDR-grafted antibody will have at least one or two but generally all three recipient CDR's (of heavy and/or light immunoglobulin chains) replaced with a donor CDR. At least a portion of a CDR may be replaced with a non-human CDR, or only some of the CDR's may be replaced with non-human CDR's. It is only necessary to replace the number of CDR's required for binding of the humanized antibody to a 33410 or a fragment thereof. Preferably, the donor will be a rodent antibody, e.g., a rat or mouse antibody, and the recipient will be a human framework or a human consensus framework. Typically, the immunoglobulin providing the CDR's is called the “donor” and the immunoglobulin providing the framework is called the “acceptor.” In one embodiment, the donor immunoglobulin is non-human (e.g., rodent). The acceptor framework is a naturally occurring (e.g., a human) framework or a consensus framework, or a sequence about 85% or higher, preferably 90%, 95%, 99% or higher identical thereto.
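The identity thresholds recited above (about 85%, 90%, 95%, 99%) can be illustrated with a minimal Python sketch. It assumes the two framework sequences have already been aligned to equal length without gaps; a real comparison would normally use a gapped alignment tool. The function names and the two 20-residue strings are hypothetical placeholders and do not correspond to any sequence in this disclosure.

```python
def percent_identity(candidate, reference):
    """Percent of positions at which two pre-aligned sequences match.

    Simplifying assumption: both inputs are already aligned, gap-free,
    and of equal length.
    """
    if len(candidate) != len(reference) or not reference:
        raise ValueError("sequences must be non-empty and equal in length")
    matches = sum(a == b for a, b in zip(candidate, reference))
    return 100.0 * matches / len(reference)


def meets_threshold(candidate, reference, threshold=85.0):
    """True if candidate is at least `threshold` percent identical."""
    return percent_identity(candidate, reference) >= threshold


reference_fr = "EVQLVESGGGLVQPGGSLRL"  # placeholder framework stretch
candidate_fr = "EVQLVESGGGLVKPGGSLKL"  # same stretch with two substitutions

identity = percent_identity(candidate_fr, reference_fr)
```

With two differences over twenty positions, the candidate passes the 85% and 90% cutoffs but not the 95% cutoff.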

[3735] As used herein, the term “consensus sequence” refers to the sequence formed from the most frequently occurring amino acids (or nucleotides) in a family of related sequences (see, e.g., Winnacker, From Genes to Clones (Verlagsgesellschaft, Weinheim, Germany 1987)). In a family of proteins, each position in the consensus sequence is occupied by the amino acid occurring most frequently at that position in the family. If two amino acids occur equally frequently, either can be included in the consensus sequence. A “consensus framework” refers to the framework region in the consensus immunoglobulin sequence.
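The definition above translates directly into a column-by-column majority vote. The following Python sketch assumes the family members are already aligned to equal length; the name `consensus_sequence`, the toy four-residue family, and the tie-breaking rule (alphabetically first residue) are choices made for this example. The definition itself permits either residue in a tie.

```python
from collections import Counter


def consensus_sequence(aligned):
    """Most frequent residue at each position of pre-aligned sequences."""
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to equal length")
    consensus = []
    for column in zip(*aligned):
        counts = Counter(column)
        top = max(counts.values())
        # Ties: either residue is allowed; pick the alphabetically first
        # one here only so that the output is deterministic.
        consensus.append(min(r for r, c in counts.items() if c == top))
    return "".join(consensus)


family = ["ACDE", "ACDF", "GCDF", "ACEE"]  # toy aligned family
result = consensus_sequence(family)
```

In the toy family the last column is tied between E and F, so the example returns E at that position under the stated tie-breaking assumption.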

[3736] An antibody can be humanized by methods known in the art. Humanized antibodies can be generated by replacing sequences of the Fv variable region that are not directly involved in antigen binding with equivalent sequences from human Fv variable regions. General methods for generating humanized antibodies are provided by Morrison, S. L., 1985, Science 229:1202-1207, by Oi et al., 1986, BioTechniques 4:214, and by Queen et al. U.S. Pat. No. 5,585,089, U.S. Pat. No. 5,693,761 and U.S. Pat. No. 5,693,762, the contents of all of which are hereby incorporated by reference. Those methods include isolating, manipulating, and expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable regions from at least one of a heavy or light chain. Sources of such nucleic acid are well known to those skilled in the art and, for example, may be obtained from a hybridoma producing an antibody against a 33410 polypeptide or fragment thereof. The recombinant DNA encoding the humanized antibody, or fragment thereof, can then be cloned into an appropriate expression vector.

[3737] Humanized or CDR-grafted antibodies can be produced by CDR-grafting or CDR substitution, wherein one, two, or all CDR's of an immunoglobulin chain can be replaced. See e.g., U.S. Pat. No. 5,225,539; Jones et al. 1986 Nature 321:522-525; Verhoeyen et al. 1988 Science 239:1534; Beidler et al. 1988 J. Immunol. 141:4053-4060; Winter U.S. Pat. No. 5,225,539, the contents of all of which are hereby expressly incorporated by reference. Winter describes a CDR-grafting method that may be used to prepare the humanized antibodies of the present invention (UK Patent Application GB 2188638A, filed on Mar. 26, 1987; Winter U.S. Pat. No. 5,225,539), the contents of which are expressly incorporated by reference.

[3738] Also within the scope of the invention are humanized antibodies in which specific amino acids have been substituted, deleted or added. Preferred humanized antibodies have amino acid substitutions in the framework region, such as to improve binding to the antigen. For example, a humanized antibody will have framework residues identical to the donor framework residue or to another amino acid other than the recipient framework residue. To generate such antibodies, a selected, small number of acceptor framework residues of the humanized immunoglobulin chain can be replaced by the corresponding donor amino acids. Preferred locations of the substitutions include amino acid residues adjacent to the CDR, or which are capable of interacting with a CDR (see e.g., U.S. Pat. No. 5,585,089). Criteria for selecting amino acids from the donor are described in U.S. Pat. No. 5,585,089, e.g., columns 12-16 of U.S. Pat. No. 5,585,089, the contents of which are hereby incorporated by reference. Other techniques for humanizing antibodies are described in Padlan et al. EP 519596 A1, published on Dec. 23, 1992.

[3739] In preferred embodiments an antibody can be made by immunizing with purified 33410 antigen, or a fragment thereof, e.g., a fragment described herein, membrane associated antigen, tissue, e.g., crude tissue preparations, whole cells, preferably living cells, lysed cells, or cell fractions, e.g., membrane fractions.

[3740] A full-length 33410 protein or an antigenic peptide fragment of 33410 can be used as an immunogen or can be used to identify anti-33410 antibodies made with other immunogens, e.g., cells, membrane preparations, and the like. The antigenic peptide of 33410 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO:54 and encompass an epitope of 33410. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.

[3741] Fragments of 33410 which include, for example, residues 330-350, 480-505, or 695-720 of SEQ ID NO:54 can be used to make, e.g., antibodies against hydrophilic regions of the 33410 protein, or can be used as immunogens or to characterize the specificity of an antibody. Similarly, a fragment of 33410 which includes, for example, residues 60-72, 260-277, or 780-793 of SEQ ID NO:54 can be used to make an antibody against a hydrophobic region of the 33410 protein; a fragment of 33410 which includes residues 42-601 of SEQ ID NO:54 can be used to make an antibody against an extracellular region of the 33410 protein; a fragment of 33410 which includes residues 699-835 of SEQ ID NO:54 can be used to make an antibody against an intracellular region of the 33410 protein; and a fragment of 33410 which includes residues 42-601 or 139-149 of SEQ ID NO:54 can be used to make an antibody against a carboxylesterase region of the 33410 protein.
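The residue ranges quoted above use the usual 1-based, inclusive numbering for protein sequences, which differs from the 0-based, end-exclusive slicing used by most programming languages. The following Python sketch shows one way to make the conversion explicit. The helper name `extract_residues` is hypothetical, and the placeholder string of 835 identical residues stands in for SEQ ID NO:54, which is not reproduced here; its length is chosen only so that the quoted ranges fit.

```python
def extract_residues(seq, start, end):
    """Return residues start..end of seq using 1-based inclusive numbering."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("range falls outside the sequence")
    return seq[start - 1:end]  # convert to 0-based, end-exclusive slice


# Hydrophilic ranges as quoted in the paragraph above.
HYDROPHILIC_RANGES = [(330, 350), (480, 505), (695, 720)]

placeholder = "A" * 835  # stand-in only, NOT the 33410 sequence
peptides = [extract_residues(placeholder, s, e) for s, e in HYDROPHILIC_RANGES]
lengths = [len(p) for p in peptides]
```

Each of the three quoted ranges yields a peptide longer than the eight-residue minimum stated for an antigenic peptide.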

[3742] Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[3743] Antibodies which bind only native 33410 protein, only denatured or otherwise non-native 33410 protein, or which bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be identified by identifying antibodies that bind to native but not denatured 33410 protein.

[3744] Preferred epitopes encompassed by the antigenic peptide are regions of 33410 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 33410 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 33410 protein and are thus likely to constitute surface residues useful for targeting antibody production.

[3745] The anti-33410 antibody can be a single chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher, D. et al. (1999) Ann N Y Acad Sci 880:263-80; and Reiter, Y. (1996) Clin Cancer Res 2:245-52). The single chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 33410 protein.

[3746] In a preferred embodiment the antibody has effector function and can fix complement. In other embodiments the antibody does not recruit effector cells or fix complement.

[3747] In a preferred embodiment, the antibody has reduced or no ability to bind an Fc receptor. For example, it is an isotype or subtype, fragment or other mutant, which does not support binding to an Fc receptor, e.g., it has a mutagenized or deleted Fc receptor binding region.

[3748] The antibody can be coupled to a toxin, e.g., a polypeptide toxin, e.g., ricin or diphtheria toxin or an active fragment thereof, or to a radionuclide or an imaging agent, e.g., a radioactive, enzymatic, or other imaging agent, e.g., an NMR contrast agent. Labels which produce detectable radioactive emissions or fluorescence are preferred.

[3749] An anti-33410 antibody (e.g., monoclonal antibody) can be used to isolate 33410 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-33410 antibody can be used to detect 33410 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-33410 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labeling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

[3750] The invention also includes a nucleic acid that encodes an anti-33410 antibody, e.g., an anti-33410 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g., CHO or lymphatic cells.

[3751] The invention also includes cell lines, e.g., hybridomas, which make an anti-33410 antibody, e.g., an antibody described herein, and methods of using said cells to make a 33410 antibody.

[3752] 33410 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[3753] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[3754] A vector can include a 33410 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those that direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 33410 proteins, mutant forms of 33410 proteins, fusion proteins, and the like).

[3755] The recombinant expression vectors of the invention can be designed for expression of 33410 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990). Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[3756] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[3757] Purified fusion proteins can be used in 33410 activity assays (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 33410 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells that are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six weeks).

[3758] One strategy used to maximize recombinant protein expression in E. coli is to express the protein in a host strain with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990) 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
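The codon-replacement strategy described above can be sketched in Python as a synonymous recoding step: each codon is translated with the standard genetic code and then replaced by a single preferred codon for that amino acid. The `PREFERRED` table below is an illustrative placeholder assembled for this example and should not be read as a measured E. coli codon-usage table; in practice it would be replaced by usage data for the chosen host strain. The function names and the toy open reading frame are likewise hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# ASSUMPTION: illustrative "preferred" codons only, not measured usage data.
PREFERRED = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TAA",
}


def translate(dna):
    """Translate an in-frame DNA sequence ('*' marks a stop codon)."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna):
    """Replace every codon with the table's preferred synonymous codon."""
    return "".join(PREFERRED[aa] for aa in translate(dna))


original = "ATGCTAAGAGGATCATAG"  # toy ORF, NOT a 33410 sequence
recoded = recode(original)
```

Because every replacement is synonymous, the recoded sequence encodes the same polypeptide as the original; only the nucleotide sequence changes, which is why the altered sequence can be prepared by standard DNA synthesis.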

[3759] The 33410 expression vector can be a yeast expression vector, a vector for expression in insect cells, e.g., a baculovirus expression vector or a vector suitable for expression in mammalian cells.

[3760] When used in mammalian cells, the expression vector's control functions are often provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[3761] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3:537-546).

[3762] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus. For a discussion of the regulation of gene expression using antisense genes see Weintraub, H. et al., Antisense RNA as a molecular tool for genetic analysis, Reviews—Trends in Genetics, Vol. 1(1) 1986.

[3763] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., a 33410 nucleic acid molecule within a recombinant expression vector or a 33410 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell, but also to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[3764] A host cell can be any prokaryotic or eukaryotic cell. For example, a 33410 protein can be expressed in bacterial cells such as E. coli, insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells). Other suitable host cells are known to those skilled in the art.

[3765] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[3766] A host cell of the invention can be used to produce (i.e., express) a 33410 protein. Accordingly, the invention further provides methods for producing a 33410 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a 33410 protein has been introduced) in a suitable medium such that a 33410 protein is produced. In another embodiment, the method further includes isolating a 33410 protein from the medium or the host cell.

[3767] In another aspect, the invention features a cell or purified preparation of cells which include a 33410 transgene, or which otherwise misexpress 33410. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include a 33410 transgene, e.g., a heterologous form of a 33410, e.g., a gene derived from humans (in the case of a non-human cell). The 33410 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene that misexpresses an endogenous 33410, e.g., a gene the expression of which is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders that are related to mutated or misexpressed 33410 alleles or for use in drug screening.

[3768] In another aspect, the invention features a human cell, e.g., a neurological cell, transformed with nucleic acid that encodes a subject 33410 polypeptide.

[3769] Also provided are cells, preferably human cells, e.g., human neurological cells (e.g., neurons), in which an endogenous 33410 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 33410 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 33410 gene. For example, an endogenous 33410 gene that is “transcriptionally silent,” e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques such as targeted homologous recombination can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published on May 16, 1991.

[3770] 33410 Transgenic Animals

[3771] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of a 33410 protein and for identifying and/or evaluating modulators of 33410 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 33410 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[3772] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of a 33410 protein to particular cells. A transgenic founder animal can be identified based upon the presence of a 33410 transgene in its genome and/or expression of 33410 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a 33410 protein can further be bred to other transgenic animals carrying other transgenes.

[3773] 33410 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments the nucleic acid is placed under the control of a tissue specific promoter, e.g., a milk or egg specific promoter, and recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[3774] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[3775] Uses of 33410

[3776] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: a) screening assays; b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); and c) methods of treatment (e.g., therapeutic and prophylactic).

[3777] The isolated nucleic acid molecules of the invention can be used, for example, to express a 33410 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect a 33410 mRNA (e.g., in a biological sample) or a genetic alteration in a 33410 gene, and to modulate 33410 activity, as described further below. The 33410 proteins can be used to treat disorders characterized by insufficient or excessive production of a 33410 substrate or production of 33410 inhibitors. In addition, the 33410 proteins can be used to screen for naturally occurring 33410 substrates, to screen for drugs or compounds which modulate 33410 activity, as well as to treat disorders characterized by insufficient or excessive production of 33410 protein or production of 33410 protein forms which have decreased, aberrant or unwanted activity compared to 33410 wild type protein (e.g., neurological disorders and/or carcinomas). Moreover, the anti-33410 antibodies of the invention can be used to detect and isolate 33410 proteins, regulate the bioavailability of 33410 proteins, and modulate 33410 activity.

[3778] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 33410 polypeptide is provided. The method includes: contacting the compound with the subject 33410 polypeptide; and evaluating ability of the compound to interact with, e.g., to bind or form a complex with the subject 33410 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with subject 33410 polypeptide. It can also be used to find natural or synthetic inhibitors of subject 33410 polypeptide. Screening methods are discussed in more detail below.

[3779] 33410 Screening Assays

[3780] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to 33410 proteins, have a stimulatory or inhibitory effect on, for example, 33410 expression or 33410 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 33410 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 33410 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[3781] In one embodiment, the invention provides assays for screening candidate or test compounds that are substrates of a 33410 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate the activity of a 33410 protein or polypeptide or a biologically active portion thereof.

[3782] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann, R. N. et al. J. Med. Chem. 1994, 37: 2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam, K. S. (1997) Anticancer Drug Des. 12:145).

[3783] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. U.S.A. 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and in Gallop et al. (1994) J. Med. Chem. 37:1233.

[3784] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. '409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390); (Devlin (1990) Science 249:404-406); (Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382); (Felici (1991) J. Mol. Biol. 222:301-310); (Ladner supra.).

[3785] In one embodiment, an assay is a cell-based assay in which a cell which expresses a 33410 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 33410 activity is determined. Determining the ability of the test compound to modulate 33410 activity can be accomplished by monitoring, for example, binding to neurexins on neurons; the ability to act as a cell surface receptor; cell adhesion properties; the ability to mediate cell-cell interactions between neurons; the ability to regulate inter-neuronal recognition pathways for axon pathfinding; the ability to regulate neuritogenesis; or binding Zn²⁺, Ca²⁺, Mg²⁺, Cd²⁺ and/or Mn²⁺. The cell, for example, can be of mammalian origin, e.g., human.

[3786] The ability of the test compound to modulate 33410 binding to a compound, e.g., a 33410 substrate, or to bind to 33410 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 33410 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 33410 could be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 33410 binding to a 33410 substrate in a complex. For example, compounds (e.g., 33410 substrates) can be labeled with ¹²⁵I, ³⁵S, ¹⁴C, or ³H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[3787] The ability of a compound (e.g., a 33410 substrate) to interact with 33410 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 33410 without the labeling of either the compound or the 33410. McConnell, H. M. et al. (1992) Science 257:1906-1912. As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 33410.

[3788] In yet another embodiment, a cell-free assay is provided in which a 33410 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 33410 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 33410 proteins to be used in assays of the present invention include fragments that participate in interactions with non-33410 molecules, e.g., fragments with high surface probability scores.

[3789] Soluble and/or membrane-bound forms of isolated proteins (e.g., 33410 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, Isotridecylpoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[3790] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[3791] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternately, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
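The distance dependence noted above is commonly modeled by the Förster relation, E = 1/(1 + (r/R0)^6), where r is the donor-acceptor separation and R0 is the separation at which transfer is 50% efficient. The following is a minimal sketch under that assumption; the relation is a standard model rather than one recited in this disclosure, and the function names and the R0 value in the usage note are hypothetical.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Return the transfer efficiency E = 1 / (1 + (r / R0)**6).

    distance_nm: donor-acceptor separation r (assumed known or estimated).
    forster_radius_nm: R0 for the chosen donor/acceptor pair (assumed value).
    """
    if distance_nm <= 0 or forster_radius_nm <= 0:
        raise ValueError("distances must be positive")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


def distance_from_efficiency(efficiency: float, forster_radius_nm: float) -> float:
    """Invert the relation: r = R0 * (1/E - 1)**(1/6)."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return forster_radius_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)
```

For example, with a hypothetical R0 of 5 nm, `fret_efficiency(5.0, 5.0)` gives 0.5, and the efficiency rises steeply toward 1 as the labeled molecules come closer together, which is why acceptor emission is expected to be greatest when binding occurs.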

[3792] In another embodiment, determining the ability of the 33410 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal that can be used as an indication of real-time reactions between biological molecules.

[3793] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound, (which is not anchored), can be labeled, either directly or indirectly, with detectable labels discussed herein.

[3794] It may be desirable to immobilize either 33410, an anti-33410 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a 33410 protein, or interaction of a 33410 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/33410 fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 33410 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and complex formation is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 33410 binding or activity determined using standard techniques.

[3795] Other techniques for immobilizing either a 33410 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 33410 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical).

[3796] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[3797] In one embodiment, this assay is performed utilizing antibodies reactive with 33410 protein or target molecules but which do not interfere with binding of the 33410 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 33410 protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 33410 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 33410 protein or target molecule.

[3798] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components, by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., Trends Biochem Sci 1993 August;18(8):284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York.); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., J Mol Recognit 1998 Winter; 11(1-6):141-8; Hage, D. S., and Tweed, S. A. J Chromatogr B Biomed Sci Appl 1997 Oct. 10;699(1-2):499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[3799] In a preferred embodiment, the assay includes contacting the 33410 protein or biologically active portion thereof with a known compound which binds 33410 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a 33410 protein, wherein determining the ability of the test compound to interact with a 33410 protein includes determining the ability of the test compound to preferentially bind to 33410 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[3800] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 33410 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of a 33410 protein through modulation of the activity of a downstream effector of a 33410 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[3801] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), a reaction mixture containing the target gene product and the binding partner is prepared, under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.
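The comparison between control and test reaction mixtures can be expressed numerically. The following is a minimal sketch, assuming that complex formation is read out as a background-correctable signal (e.g., counts or absorbance); the function names, the background correction, and the 50-point selectivity margin are illustrative assumptions and are not part of the disclosure.

```python
def percent_inhibition(control_signal: float, test_signal: float,
                       background: float = 0.0) -> float:
    """Percent reduction in complex signal caused by the test compound.

    control_signal: signal from the mixture without test compound (or with placebo).
    test_signal: signal from the mixture containing the test compound.
    background: signal from a blank with no complex (hypothetical correction).
    """
    control = control_signal - background
    test = test_signal - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (1.0 - test / control)


def is_mutant_selective(inhibition_mutant: float, inhibition_normal: float,
                        margin: float = 50.0) -> bool:
    """Flag compounds that disrupt mutant far more than normal complexes.

    The margin (in percentage points) is an arbitrary illustrative cutoff.
    """
    return (inhibition_mutant - inhibition_normal) >= margin
```

For instance, a control signal of 1000 and a test signal of 250 correspond to 75% inhibition of complex formation.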

[3802] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[3803] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled, either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[3804] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[3805] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[3806] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in which either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[3807] In yet another aspect, the 33410 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 33410 (“33410-binding proteins” or “33410-bp”) and are involved in 33410 activity. Such 33410-bps can be activators or inhibitors of signals by the 33410 proteins or 33410 targets as, for example, downstream elements of a 33410-mediated signaling pathway.

[3808] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a 33410 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively, the 33410 protein can be fused to the activator domain.) If the “bait” and the “prey” proteins are able to interact, in vivo, forming a 33410-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., lacZ) that is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene that encodes the protein that interacts with the 33410 protein.

[3809] In another embodiment, modulators of 33410 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 33410 mRNA or protein evaluated relative to the level of expression of 33410 mRNA or protein in the absence of the candidate compound. When expression of 33410 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 33410 mRNA or protein expression. Alternatively, when expression of 33410 mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 33410 mRNA or protein expression. The level of 33410 mRNA or protein expression can be determined by methods described herein for detecting 33410 mRNA or protein.
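The stimulator/inhibitor decision described above can be sketched as follows. This is an illustrative sketch only: it assumes replicate expression measurements with and without the candidate compound, applies the statistical-significance requirement to both directions, and uses an exact permutation test on the difference of means as a stand-in for whatever test a practitioner would actually choose. The function names and the 0.05 threshold are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Two-sided exact permutation test on the difference of group means."""
    pooled = list(treated) + list(untreated)
    indices = range(len(pooled))
    observed = abs(mean(treated) - mean(untreated))
    total = 0
    extreme = 0
    # Enumerate every way of relabeling the pooled values into two groups.
    for chosen in combinations(indices, len(treated)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def classify_candidate(treated, untreated, alpha=0.05):
    """Label a compound by its effect on 33410 mRNA or protein expression."""
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"
```

With four replicates per condition there are 70 possible relabelings, so the smallest attainable two-sided p-value is 2/70; fewer replicates cannot reach the 0.05 threshold with this test.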

[3810] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a 33410 protein can be confirmed in vivo, e.g., in an animal such as an animal model for neurological disorders and/or carcinomas.

[3811] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a 33410 modulating agent, an antisense 33410 nucleic acid molecule, a 33410-specific antibody, or a 33410-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[3812] 33410 Detection Assays

[3813] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome, e.g., to locate gene regions associated with genetic disease or to associate 33410 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[3814] 33410 Chromosome Mapping

[3815] The 33410 nucleotide sequences or portions thereof can be used to map the location of the 33410 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 33410 sequences with genes associated with disease.

[3816] Briefly, 33410 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 33410 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 33410 sequences will yield an amplified fragment.

[3817] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes, and a full set of mouse chromosomes, can allow easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220:919-924).
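The logic of assigning a gene to a chromosome from such a panel can be sketched as a concordance check. This is an illustrative simplification, assuming an idealized panel with no false positives or negatives; the function name `concordant_chromosomes`, the hybrid names, and the panel contents are hypothetical.

```python
def concordant_chromosomes(panel, pcr_positive):
    """Return chromosomes whose presence matches the PCR result in every hybrid.

    panel:        dict mapping hybrid name -> set of human chromosomes it retains
    pcr_positive: dict mapping hybrid name -> True if an amplified fragment was seen
    """
    all_chromosomes = set().union(*panel.values())
    hits = []
    for chrom in sorted(all_chromosomes):
        # A candidate chromosome must be present in every positive hybrid
        # and absent from every negative hybrid.
        if all((chrom in retained) == pcr_positive[name]
               for name, retained in panel.items()):
            hits.append(chrom)
    return hits


# Hypothetical panel: each hybrid line retains a few human chromosomes.
panel = {
    "hybrid_A": {"1", "7", "X"},
    "hybrid_B": {"7", "12"},
    "hybrid_C": {"1", "12"},
    "hybrid_D": {"X"},
}
pcr_positive = {"hybrid_A": True, "hybrid_B": True,
                "hybrid_C": False, "hybrid_D": False}
candidates = concordant_chromosomes(panel, pcr_positive)
```

In this made-up panel only chromosome 7 is present in both PCR-positive hybrids and absent from both PCR-negative hybrids, so it is the sole candidate.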

[3818] Other mapping strategies, e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries, can be used to map 33410 to a chromosomal location.

[3819] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases, will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques (Pergamon Press, New York 1988).

[3820] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to non-coding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[3821] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325:783-787.

[3822] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 33410 gene, can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

[3823] 33410 Tissue Typing

[3824] 33410 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).
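The basis of an RFLP comparison can be illustrated with a short in silico digest. This is a sketch under stated assumptions: the two allele sequences are invented for the example, the helper `digest` is hypothetical, and the enzyme shown is EcoRI, which recognizes GAATTC and cuts after the first G.

```python
def digest(sequence, site, cut_offset):
    """Cut a linear sequence at every occurrence of a recognition site.

    cut_offset is the number of bases into the site at which the strand is cut.
    Returns the fragment lengths in order along the sequence.
    """
    sequence = sequence.upper()
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cut_positions + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Invented alleles: allele_2 carries a point change (GAATTC -> GAGTTC)
# that destroys the second EcoRI site.
allele_1 = "A" * 20 + "GAATTC" + "C" * 16 + "GAATTC" + "T" * 12
allele_2 = "A" * 20 + "GAATTC" + "C" * 16 + "GAGTTC" + "T" * 12

fragments_1 = digest(allele_1, "GAATTC", 1)  # three fragments
fragments_2 = digest(allele_2, "GAATTC", 1)  # two fragments
```

The differing fragment lengths are what would appear as different band patterns after separation and probing.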

[3825] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 33410 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.

[3826] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences of SEQ ID NO:53 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO:55 are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[3827] If a panel of reagents from 33410 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[3828] Use of Partial 33410 Sequences in Forensic Biology

[3829] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[3830] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e. another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO:53 (e.g., fragments derived from the noncoding regions of SEQ ID NO:53 having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[3831] The 33410 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 33410 probes can be used to identify tissue by species and/or by organ type.

[3832] In a similar fashion, these reagents, e.g., 33410 primers or probes, can be used to screen tissue culture for contamination (i.e., screen for the presence of a mixture of different types of cells in a culture).

[3833] Predictive Medicine of 33410

[3834] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[3835] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene which encodes 33410.

[3836] Such disorders include, e.g., a disorder associated with the misexpression of the 33410 gene, a neoplasia, or a disorder of the neurological system.

[3837] The method includes one or more of the following:

[3838] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 33410 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[3839] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 33410 gene;

[3840] detecting, in a tissue of the subject, the misexpression of the 33410 gene, at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[3841] detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of a 33410 polypeptide.

[3842] In preferred embodiments the method includes: ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 33410 gene; an insertion of one or more nucleotides into the gene, a point mutation, e.g., a substitution of one or more nucleotides of the gene, a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[3843] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from SEQ ID NO:53, or naturally occurring mutants thereof or 5′ or 3′ flanking sequences naturally associated with the 33410 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[3844] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 33410 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 33410.

[3845] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[3846] In preferred embodiments the method includes determining the structure of a 33410 gene, an abnormal structure being indicative of risk for the disorder.

[3847] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 33410 protein or a nucleic acid which hybridizes specifically with the gene. These and other embodiments are discussed below.

[3848] Diagnostic and Prognostic Assays of 33410

[3849] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 33410 molecules and for identifying variations and mutations in the sequence of 33410 molecules.

[3850] Expression Monitoring and Profiling:

[3851] The presence, level, or absence of 33410 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 33410 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 33410 protein such that the presence of 33410 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the 33410 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 33410 genes; measuring the amount of protein encoded by the 33410 genes; or measuring the activity of the protein encoded by the 33410 genes.

[3852] The level of mRNA corresponding to the 33410 gene in a cell can be determined both by in situ and by in vitro formats.

[3853] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 33410 nucleic acid, such as the nucleic acid of SEQ ID NO:53, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 33410 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[3854] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 33410 genes.

[3855] The level of mRNA in a sample that is encoded by 33410 can be evaluated with nucleic acid amplification, e.g., by RT-PCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
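The primer geometry described above can be checked with a short sketch. This is an illustration only: the text gives the ranges as approximate ("about"), so the hard cutoffs used here are a simplification, and the function names and the exact-match search are assumptions made for the example.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def flanked_region_length(template, forward, reverse):
    """Length of the plus-strand region lying between the two primer sites.

    The forward primer matches the plus strand directly; the reverse primer
    anneals to the plus strand, so its site is its reverse complement.
    Returns None if either primer has no exact match.
    """
    template = template.upper()
    fwd_start = template.find(forward.upper())
    if fwd_start == -1:
        return None
    fwd_end = fwd_start + len(forward)
    rev_start = template.find(reverse_complement(reverse), fwd_end)
    if rev_start == -1:
        return None
    return rev_start - fwd_end


def within_stated_ranges(template, forward, reverse):
    """True if primers are 10-30 nt long and flank a region of 50-200 nt."""
    length = flanked_region_length(template, forward, reverse)
    primers_ok = all(10 <= len(p) <= 30 for p in (forward, reverse))
    return primers_ok and length is not None and 50 <= length <= 200
```

A real primer design would also weigh melting temperature, self-complementarity, and specificity, none of which this sketch attempts.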

[3856] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 33410 gene being analyzed.

[3857] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 33410 mRNA, or genomic DNA, and comparing the presence of 33410 mRNA or genomic DNA in the control sample with the presence of 33410 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 33410 transcript levels.

[3858] A variety of methods can be used to determine the level of protein encoded by 33410. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody, with a sample to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)₂) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[3859] The detection methods can be used to detect 33410 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 33410 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 33410 protein include introducing into a subject a labeled anti-33410 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-33410 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[3860] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 33410 protein, and comparing the presence of 33410 protein in the control sample with the presence of 33410 protein in the test sample. The invention also includes kits for detecting the presence of 33410 in a biological sample. For example, the kit can include a compound or agent capable of detecting 33410 protein or mRNA in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 33410 protein or nucleic acid.

[3861] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[3862] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein-stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples that can be assayed and compared to the test sample. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[3863] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with misexpressed or aberrant or unwanted 33410 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as pain or deregulated cell proliferation. In one embodiment, a disease or disorder associated with aberrant or unwanted 33410 expression or activity is identified. A test sample is obtained from a subject and 33410 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 33410 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 33410 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[3864] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 33410 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for a disease or disorder associated with misexpressed or aberrant or unwanted 33410 expression or activity.

[3865] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 33410 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 33410 (e.g., other genes associated with a 33410-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
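One way such data records could be laid out as relational tables is sketched below. This is a hypothetical schema, not one given in the specification: the table and column names are invented, and SQLite (via Python's standard `sqlite3` module) stands in for the Oracle or Sybase environments named above.

```python
import sqlite3


def build_expression_db():
    """Create an in-memory database with one table of sample descriptors
    and one table of per-gene expression values (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sample (
            sample_id TEXT PRIMARY KEY,
            subject   TEXT,
            diagnosis TEXT,
            treatment TEXT
        );
        CREATE TABLE expression (
            sample_id TEXT REFERENCES sample(sample_id),
            gene      TEXT,
            level     REAL,
            PRIMARY KEY (sample_id, gene)
        );
    """)
    return conn


def add_record(conn, sample_id, subject, diagnosis, treatment, levels):
    """Store one data record: sample descriptors plus expression values.

    levels maps a gene identifier (e.g., "33410") to its expression level.
    """
    conn.execute("INSERT INTO sample VALUES (?, ?, ?, ?)",
                 (sample_id, subject, diagnosis, treatment))
    conn.executemany("INSERT INTO expression VALUES (?, ?, ?)",
                     [(sample_id, gene, level) for gene, level in levels.items()])


def expression_level(conn, sample_id, gene="33410"):
    """Return the stored expression level, or None if no value is recorded."""
    row = conn.execute(
        "SELECT level FROM expression WHERE sample_id = ? AND gene = ?",
        (sample_id, gene)).fetchone()
    return None if row is None else row[0]
```

Keeping expression values in a separate table keyed by sample and gene lets a record carry values for genes other than 33410 without changing the schema.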

[3866] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 33410 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used to diagnose a disease or disorder associated with misexpressed or aberrant or unwanted 33410 expression or activity in a subject wherein an increase or a decrease in 33410 expression is an indication that the subject has or is disposed to having a disease or disorder associated with misexpressed or aberrant or unwanted 33410 expression or activity. The method can be used to monitor a treatment for a disease or disorder associated with misexpressed or aberrant or unwanted 33410 expression or activity in a subject. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[3867] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 33410 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from a cell not contacted with the test compound.

[3868] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 33410 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profile is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
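The distance metric mentioned above, the length of the difference vector between two profiles, is the Euclidean distance. A minimal sketch follows, assuming each profile is stored as a mapping from gene identifier to expression value; the function names, gene labels, and numeric values are hypothetical.

```python
import math


def profile_distance(profile_a, profile_b):
    """Euclidean length of the difference vector between two profiles.

    Each profile maps a gene identifier to an expression value; both
    profiles must cover the same genes so the vectors share dimensions.
    """
    genes = sorted(profile_a)
    if sorted(profile_b) != genes:
        raise ValueError("profiles must contain the same genes")
    return math.sqrt(sum((profile_a[g] - profile_b[g]) ** 2 for g in genes))


def most_similar_reference(subject, references):
    """Return the name of the reference profile closest to the subject profile."""
    return min(references,
               key=lambda name: profile_distance(subject, references[name]))


# Hypothetical profiles; values are illustrative only.
subject = {"33410": 4.0, "gene_b": 1.0}
references = {
    "normal":   {"33410": 1.0, "gene_b": 1.0},
    "disorder": {"33410": 4.5, "gene_b": 1.0},
}
closest = most_similar_reference(subject, references)
```

With these made-up values the subject profile lies 3.0 from the "normal" reference and 0.5 from the "disorder" reference, so the latter is selected.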

[3869] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[3870] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile, and the reference expression profiles each include a value representing the level of 33410 expression.

[3871] 33410 Arrays and Uses Thereof

[3872] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to a 33410 molecule (e.g., a 33410 nucleic acid or a 33410 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm², and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to the addresses of the plurality can be disposed on the array.

[3873] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to a 33410 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 33410. Each address of the subset can include a capture probe that hybridizes to a different region of a 33410 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for a 33410 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 33410 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 33410 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).

[3874] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[3875] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to a 33410 polypeptide or fragment thereof. The polypeptide can be a naturally occurring interaction partner of 33410 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-33410 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[3876] In another aspect, the invention features a method of analyzing the expression of 33410. The method includes providing an array as described above; contacting the array with a sample and detecting binding of a 33410-molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[3877] In another embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array, particularly the expression of 33410. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 33410. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
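The idea of finding genes co-regulated with 33410 across many samples can be illustrated with a simple correlation screen. Note the substitution: this sketch ranks genes by Pearson correlation with 33410 rather than performing the hierarchical, k-means, or Bayesian clustering named above. The function names, the 0.9 threshold, and the expression values are assumptions made for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a flat profile carries no co-variation information
    return cov / (spread_x * spread_y)


def coregulated_genes(expression, target="33410", threshold=0.9):
    """Return genes whose expression across samples tracks the target gene.

    expression maps a gene identifier to its levels across the same
    ordered set of samples (e.g., tissues).
    """
    target_levels = expression[target]
    return sorted(gene for gene, levels in expression.items()
                  if gene != target
                  and pearson(target_levels, levels) >= threshold)


# Hypothetical expression levels across five tissues.
expression = {
    "33410":  [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_a": [2.0, 4.0, 6.0, 8.0, 10.0],  # rises with 33410
    "gene_b": [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as 33410 rises
    "gene_c": [3.0, 3.0, 3.0, 3.0, 3.0],   # constant
}
partners = coregulated_genes(expression)
```

A full clustering method groups all genes at once, whereas this screen only asks which genes follow a single gene of interest.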

[3878] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 33410 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[3879] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of other than the target gene can be ascertained and counteracted.

[3880] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of a 33410-associated disease or disorder; and processes, such as a cellular transformation associated with a 33410-associated disease or disorder. The method can also evaluate the treatment and/or progression of a 33410-associated disease or disorder.

[3881] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 33410) that could serve as a molecular target for diagnosis or therapeutic intervention.

[3882] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon a 33410 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to a 33410 polypeptide or fragment thereof. For example, multiple variants of a 33410 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the addresses of the plurality can be disposed on the array.

[3883] The polypeptide array can be used to detect a 33410 binding compound, e.g., an antibody in a sample from a subject with specificity for a 33410 polypeptide or the presence of a 33410-binding protein or ligand.

[3884] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 33410 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[3885] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 33410 or from a cell or subject in which a 33410 mediated response has been elicited, e.g., by contact of the cell with 33410 nucleic acid or protein, or administration to the cell or subject of 33410 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 33410 (or does not express as highly as in the case of the 33410 positive plurality of capture probes) or from a cell or subject in which a 33410 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which are preferably other than a 33410 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[3886] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a first sample from a cell or subject which expresses or mis-expresses 33410 or from a cell or subject in which a 33410-mediated response has been elicited, e.g., by contact of the cell with 33410 nucleic acid or protein, or administration to the cell or subject of 33410 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a second sample from a cell or subject which does not express 33410 (or does not express as highly as in the case of the 33410 positive plurality of capture probes) or from a cell or subject in which a 33410-mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used, the plurality of addresses with capture probes should be present on both arrays.

[3887] In another aspect, the invention features a method of analyzing 33410, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 33410 nucleic acid or amino acid sequence; comparing the 33410 sequence with one or more preferably a plurality of sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database; to thereby analyze 33410.
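The comparison of a 33410 sequence against a collection of sequences can be sketched as follows. This is a deliberately simplified illustration that scores ungapped, equal-length, pre-aligned sequences by percent identity; practical database searches rely on alignment programs instead. The query, the collection entries, and all function names are hypothetical and are not taken from SEQ ID NO:53.

```python
def percent_identity(a, b):
    """Percent of positions that match between two pre-aligned sequences of equal length."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def rank_by_identity(query, collection):
    """Score every same-length entry of the collection against the query, best match first."""
    scored = [(name, percent_identity(query, seq))
              for name, seq in collection.items() if len(seq) == len(query)]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical query and sequence collection.
query = "ACGT"
collection = {"s1": "ACGA", "s2": "TTTT", "s3": "ACGT"}
ranking = rank_by_identity(query, collection)
```

The same identity measure is one way to check a threshold such as "at least 90% identical" for a candidate variant, under the stated assumption that the sequences are already aligned.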

[3888] Detection of 33410 Variations or Mutations

[3889] The methods of the invention can also be used to detect genetic alterations in a 33410 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by misregulation in 33410 protein activity or nucleic acid expression, such as a neurological disorder and/or carcinomas. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a 33410 protein, or the mis-expression of the 33410 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of: 1) a deletion of one or more nucleotides from a 33410 gene; 2) an addition of one or more nucleotides to a 33410 gene; 3) a substitution of one or more nucleotides of a 33410 gene; 4) a chromosomal rearrangement of a 33410 gene; 5) an alteration in the level of a messenger RNA transcript of a 33410 gene; 6) aberrant modification of a 33410 gene, such as of the methylation pattern of the genomic DNA; 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of a 33410 gene; 8) a non-wild type level of a 33410 protein; 9) allelic loss of a 33410 gene; and 10) inappropriate post-translational modification of a 33410 protein.

[3890] An alteration can be detected using a probe/primer in a polymerase chain reaction, such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 33410 gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a 33410 gene under conditions such that hybridization and amplification of the 33410 gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[3891] In another embodiment, mutations in a 33410 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA are isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment lengths are determined, e.g., by gel electrophoresis, and compared. Differences in fragment lengths between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
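The comparison of restriction fragment lengths can be illustrated in silico. The following minimal sketch assumes a linear DNA molecule and a single enzyme; the EcoRI recognition site (GAATTC, cut after the first G) is used as the example enzyme, and the two short sequences are hypothetical and unrelated to any 33410 sequence.

```python
def digest(seq, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at every occurrence of site.

    cut_offset is the number of bases from the start of the site to the cut position.
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


def patterns_differ(sample, control, site, cut_offset):
    """True if the sorted fragment lengths of sample and control are not identical."""
    return sorted(digest(sample, site, cut_offset)) != sorted(digest(control, site, cut_offset))


# Hypothetical control with two EcoRI sites, and a sample in which a
# single base substitution destroys the second site.
control = "ATGCGAATTCGGCTAGAATTCTTA"
sample = "ATGCGAATTCGGCTAGAGTTCTTA"

control_fragments = digest(control, "GAATTC", 1)
sample_fragments = digest(sample, "GAATTC", 1)
```

Here the loss of one recognition site merges two control fragments into a single longer fragment in the sample, which is the kind of difference that gel electrophoresis would reveal.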

[3892] In other embodiments, genetic mutations in 33410 can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the others. A different probe is located at each address of the plurality. A probe can be complementary to a region of a 33410 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of a 33410 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 33410 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.

[3893] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 33410 gene and detect mutations by comparing the sequence of the sample 33410 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.

[3894] Other methods for detecting mutations in the 33410 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl. Acad Sci USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295).

[3895] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 33410 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[3896] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 33410 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc. Natl. Acad. Sci. USA 86:2766; see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control 33410 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence; the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).

[3897] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[3898] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad. Sci USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[3899] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.

[3900] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 50%, 60%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to a 33410 nucleic acid.

[3901] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO:53 or the complement of SEQ ID NO:53. Different locations can be different but overlapping or non-overlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[3902] The set can be useful, e.g., for identifying SNPs, or identifying specific alleles of 33410. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a biallelic or polymorphic locus.

[3903] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the Tm of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
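A set of four oligonucleotides that differ only at an interrogation position can be generated as in the following sketch. The template sequence, the chosen position, and the function names are hypothetical and serve only to illustrate the idea; exact reverse-complement matching stands in for hybridization, which in practice depends on melting temperature and assay conditions.

```python
BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def interrogation_set(template, position):
    """Return four oligonucleotides keyed by the base placed at the interrogation position."""
    if not 0 <= position < len(template):
        raise ValueError("position is outside the template")
    return {base: template[:position] + base + template[position + 1:]
            for base in BASES}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def matching_base(oligos, target_strand):
    """Return the interrogation base of the oligonucleotide that is fully complementary to the target."""
    for base, oligo in oligos.items():
        if reverse_complement(oligo) == target_strand:
            return base
    return None


# Hypothetical 9-mer with the interrogation position at index 4.
oligos = interrogation_set("GATTACAGT", 4)

# A target strand carrying the allele that pairs with the "G" oligonucleotide.
target = reverse_complement("GATTGCAGT")
called = matching_base(oligos, target)
```

With differential labels on the four oligonucleotides, the label of the matching member would identify the allele present at the interrogation position.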

[3904] In a preferred embodiment the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, a 33410 nucleic acid.

[3905] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a 33410 gene.

[3906] Use of 33410 Molecules as Surrogate Markers

[3907] The 33410 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 33410 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 33410 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker that correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[3908] The 33410 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker that correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., a 33410 marker) transcription or expression, the amplified marker may be in a quantity that is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-33410 antibodies may be employed in an immune-based detection system for a 33410 protein marker, or 33410-specific radiolabeled probes may be used to detect a 33410 mRNA marker. Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. 
Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[3909] The 33410 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker that correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35:1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA, or protein (e.g., 33410 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 33410 DNA may correlate with a 33410 drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[3910] Pharmaceutical Compositions of 33410

[3911] The nucleic acids and polypeptides, fragments thereof, as well as anti-33410 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[3912] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[3913] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol or sorbitol, or sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including an agent in the composition that delays absorption, for example, aluminum monostearate and gelatin.

[3914] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle that contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[3915] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[3916] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser that contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[3917] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[3918] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[3919] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[3920] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[3921] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD₅₀ (the dose lethal to 50% of the population) and the ED₅₀ (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD₅₀/ED₅₀. Compounds that exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
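
By way of illustration only, the ratio described above can be sketched in a few lines of Python. The function name `therapeutic_index`, the compound labels, and the numerical values are hypothetical and are not taken from any experiment described herein; the sketch assumes that both doses are expressed in the same units.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the ratio LD50/ED50; both doses must share the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical (LD50, ED50) pairs in mg/kg, for illustration only.
compounds = {
    "compound_A": (500.0, 5.0),
    "compound_B": (200.0, 20.0),
}

# Compute the index for each compound and pick the one with the highest value,
# reflecting the stated preference for high therapeutic indices.
indices = {name: therapeutic_index(ld, ed) for name, (ld, ed) in compounds.items()}
preferred = max(indices, key=indices.get)
```

In this sketch the compound with the larger index is selected, which mirrors the stated preference but does not replace the delivery-system considerations noted above.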

[3922] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED₅₀ with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC₅₀ (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography. As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. 
Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
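
As a non-limiting arithmetic sketch, the per-kilogram ranges recited above can be converted to a total amount for a subject of a given body weight. The helper `total_dose_range_mg` and the 70 kg body weight are assumptions introduced for illustration; only the mg/kg ranges are taken from the text.

```python
# Ranges recited in the text, in mg per kg body weight, broadest first.
DOSE_RANGES_MG_PER_KG = [
    (0.001, 30.0),
    (0.01, 25.0),
    (0.1, 20.0),
    (1.0, 10.0),
]


def total_dose_range_mg(body_weight_kg: float, mg_per_kg_range: tuple) -> tuple:
    """Scale a (low, high) mg/kg range to total milligrams for one subject."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = mg_per_kg_range
    return (low * body_weight_kg, high * body_weight_kg)


# Hypothetical 70 kg subject, for illustration only.
for mg_per_kg in DOSE_RANGES_MG_PER_KG:
    low_mg, high_mg = total_dose_range_mg(70.0, mg_per_kg)
    print(f"{mg_per_kg[0]} to {mg_per_kg[1]} mg/kg: {low_mg:g} to {high_mg:g} mg")
```

The sketch performs unit scaling only; the factors listed above (severity of the disorder, previous treatments, general health and age) are not modeled.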

[3923] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J. Acquired Immune Deficiency Syndromes and Human Retrovirology 14:193).

[3924] The present invention encompasses agents that modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[3925] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

[3926] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive metal ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, puromycin, maytansinoids, e.g., maytansinol (see U.S. Pat. No. 5,208,020), CC-1065 (see U.S. Pat. Nos. 5,475,092, 5,585,499, 5,846,545) and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine and vinblastine).

[3927] The conjugates of the invention can be used for modifying a given biological response; the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, alpha-interferon, beta-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[3928] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[3929] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[3930] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[3931] Methods of Treatment for 33410

[3932] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 33410 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[3933] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. “Pharmacogenomics”, as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype”, or “drug response genotype”). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 33410 molecules of the present invention or 33410 modulators according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

[3934] In one aspect, the invention provides a method for preventing in a subject, a disease or condition associated with an aberrant or unwanted 33410 expression or activity, by administering to the subject a 33410 or an agent which modulates 33410 expression or at least one 33410 activity. Subjects at risk for a disease which is caused or contributed to by aberrant or unwanted 33410 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 33410 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 33410 aberrance, for example, a 33410, 33410 agonist or 33410 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[3935] It is possible that some 33410 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[3936] The 33410 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of cellular proliferative and/or differentiative disorders, and neurodegenerative disorders as described above, as well as immune or inflammatory disorders, cardiovascular disorders, disorders associated with bone metabolism, liver disorders, viral diseases, pain or metabolic disorders.

[3937] The 33410 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of immune disorders. Examples of hematopoietic disorders or diseases include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy such as atopic allergy.

[3938] Examples of disorders involving the heart or “cardiovascular disorder” include, but are not limited to, a disease, disorder, or state involving the cardiovascular system, e.g., the heart, the blood vessels, and/or the blood. A cardiovascular disorder can be caused by an imbalance in arterial pressure, a malfunction of the heart, or an occlusion of a blood vessel, e.g., by a thrombus. Examples of such disorders include hypertension, atherosclerosis, coronary artery spasm, congestive heart failure, coronary artery disease, valvular disease, arrhythmias, and cardiomyopathies.

[3939] Aberrant expression and/or activity of 33410 molecules may mediate disorders associated with bone metabolism. “Bone metabolism” refers to direct or indirect effects in the formation or degeneration of bone structures, e.g., bone formation, bone resorption, etc., which may ultimately affect the concentrations in serum of calcium and phosphate. This term also includes activities mediated by the effects of 33410 molecules in bone cells, e.g., osteoclasts and osteoblasts, that may in turn result in bone formation and degeneration. For example, 33410 molecules may support different activities of bone resorbing osteoclasts such as the stimulation of differentiation of monocytes and mononuclear phagocytes into osteoclasts. Accordingly, 33410 molecules that modulate the production of bone cells can influence bone formation and degeneration, and thus may be used to treat bone disorders. Examples of such disorders include, but are not limited to, osteoporosis, osteodystrophy, osteomalacia, rickets, osteitis fibrosa cystica, renal osteodystrophy, osteosclerosis, anti-convulsant treatment, osteopenia, fibrogenesis-imperfecta ossium, secondary hyperparathyroidism, hypoparathyroidism, hyperparathyroidism, cirrhosis, obstructive jaundice, drug induced metabolism, medullary carcinoma, chronic renal disease, sarcoidosis, glucocorticoid antagonism, malabsorption syndrome, steatorrhea, tropical sprue, idiopathic hypercalcemia and milk fever.

[3940] Disorders which may be treated or diagnosed by methods described herein include, but are not limited to, disorders associated with an accumulation in the liver of fibrous tissue, such as that resulting from an imbalance between production and degradation of the extracellular matrix accompanied by the collapse and condensation of preexisting fibers. The methods described herein can be used to diagnose or treat hepatocellular necrosis or injury induced by a wide variety of agents including processes which disturb homeostasis, such as an inflammatory process, tissue damage resulting from toxic injury or altered hepatic blood flow, and infections (e.g., bacterial, viral and parasitic). For example, the methods can be used for the early detection of hepatic injury, such as portal hypertension or hepatic fibrosis. In addition, the methods can be employed to detect liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, A1-anticarboxylesterase deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease), disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome). 
Additionally, the methods described herein may be useful for the early detection and treatment of liver injury associated with the administration of various chemicals or drugs, such as, for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol, or which represents a hepatic manifestation of a vascular disorder such as obstruction of either the intrahepatic or extrahepatic bile flow or an alteration in hepatic circulation resulting, for example, from chronic heart failure, veno-occlusive disease, portal vein thrombosis or Budd-Chiari syndrome.

[3941] Additionally, 33410 molecules may play an important role in the etiology of certain viral diseases, including but not limited to Hepatitis B, Hepatitis C and Herpes Simplex Virus (HSV). Modulators of 33410 activity could be used to control viral diseases. The modulators can be used in the treatment and/or diagnosis of viral infected tissue or virus-associated tissue fibrosis, especially liver and liver fibrosis. Also, 33410 modulators can be used in the treatment and/or diagnosis of virus-associated carcinoma, especially hepatocellular cancer.

[3942] Additionally, 33410 may play an important role in the regulation of metabolism or pain disorders. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes. Examples of pain disorders include, but are not limited to, pain response elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia, usually referred to as hyperalgesia (described in, for example, Fields, H. L. (1987) Pain, New York: McGraw-Hill); pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches; pain associated with surgery; pain related to irritable bowel syndrome; or chest pain.

[3943] As discussed, successful treatment of 33410 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using an assay described above, that prove to exhibit negative modulatory activity can be used in accordance with the invention to prevent and/or ameliorate symptoms of 33410 disorders. Such molecules can include, but are not limited to, peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)₂ and Fab expression library fragments, scFv molecules, and epitope-binding fragments thereof).

[3944] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[3945] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[3946] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 33410 expression is through the use of aptamer molecules specific for 33410 protein. Aptamers are nucleic acid molecules having a tertiary structure that permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. Curr. Opin. Chem Biol. 1997, 1(1): 5-9; and Patel, D. J. Curr Opin Chem Biol 1997 June; 1(1):32-46). Since nucleic acid molecules may in many cases be more conveniently introduced into target cells than therapeutic protein molecules may be, aptamers offer a method by which 33410 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[3947] Antibodies can be generated that are both specific for target gene product and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 33410 disorders. For a description of antibodies, see the Antibody section above.

[3948] In circumstances wherein injection of an animal or a human subject with a 33410 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 33410 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. Ann Med 1999;31(1):66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. Cancer Treat Res 1998;94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 33410 protein. Vaccines directed to a disease characterized by 33410 expression may also be generated in this fashion.

[3949] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see, e.g., Marasco et al. (1993) Proc. Natl. Acad. Sci. USA 90:7889-7893).

[3950] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 33410 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders.

[3951] Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. The compound which is able to modulate 33410 activity is used as a template, or “imprinting molecule”, to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix that contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al (1996) Current Opinion in Biotechnology 7:89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2:166-173. Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al (1993) Nature 361:645-647. Through the use of isotope labeling, the “free” concentration of compound that modulates the expression or activity of 33410 can be readily monitored and used in calculations of IC₅₀.

[3952] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC₅₀. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al (1995) Analytical Chemistry 67:2142-2144.

[3953] Another aspect of the invention pertains to methods of modulating 33410 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with a 33410 or an agent that modulates one or more of the 33410 protein activities associated with the cell. An agent that modulates 33410 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of a 33410 protein (e.g., a 33410 substrate or receptor), a 33410 antibody, a 33410 agonist or antagonist, a peptidomimetic of a 33410 agonist or antagonist, or other small molecule.

[3954] In one embodiment, the agent stimulates one or more 33410 activities. Examples of such stimulatory agents include active 33410 protein and a nucleic acid molecule encoding 33410. In another embodiment, the agent inhibits one or more 33410 activities. Examples of such inhibitory agents include antisense 33410 nucleic acid molecules, anti-33410 antibodies, and 33410 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of a 33410 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., up regulates or down regulates) 33410 expression or activity. In another embodiment, the method involves administering a 33410 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 33410 expression or activity.

[3955] Stimulation of 33410 activity is desirable in situations in which 33410 is abnormally down-regulated and/or in which increased 33410 activity is likely to have a beneficial effect. Likewise, inhibition of 33410 activity is desirable in situations in which 33410 is abnormally up-regulated and/or in which decreased 33410 activity is likely to have a beneficial effect.

[3956] 33410 Pharmacogenomics

[3957] The 33410 molecules of the present invention, as well as agents, or modulators which have a stimulatory or inhibitory effect on 33410 activity (e.g., 33410 gene expression) as identified by a screening assay described herein can be administered to individuals to treat (prophylactically or therapeutically) 33410 associated disorders (e.g., neurological disorders and/or carcinomas) associated with aberrant or unwanted 33410 activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a 33410 molecule or 33410 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a 33410 molecule or 33410 modulator.

[3958] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23(10-11):983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43(2):254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as single factors altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase deficiency (G6PD) is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[3959] One pharmacogenomics approach to identifying genes that predict drug response, known as “genome-wide association”, relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high-resolution map can be generated from a combination of some ten million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
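
The grouping of individuals into genetic categories according to a pattern of SNPs can be sketched, purely by way of example, as follows. The function `group_by_snp_pattern`, the subject labels, the marker names, and the genotype calls are hypothetical and introduced only for illustration; an actual marker map would contain far more sites.

```python
from collections import defaultdict


def group_by_snp_pattern(genotypes: dict, marker_ids: list) -> dict:
    """Group individuals that share the same genotype calls at the given markers."""
    groups = defaultdict(list)
    for individual, calls in genotypes.items():
        # The ordered tuple of calls at the chosen markers is the "pattern".
        pattern = tuple(calls[marker] for marker in marker_ids)
        groups[pattern].append(individual)
    return dict(groups)


# Hypothetical genotype calls at two bi-allelic markers, for illustration only.
genotypes = {
    "subject_1": {"snp_a": "A/A", "snp_b": "C/T"},
    "subject_2": {"snp_a": "A/G", "snp_b": "C/C"},
    "subject_3": {"snp_a": "A/A", "snp_b": "C/T"},
}

groups = group_by_snp_pattern(genotypes, ["snp_a", "snp_b"])
```

Each resulting group is a candidate set of genetically similar individuals for whom a treatment regimen might be considered together; the sketch uses exact pattern matching and does not attempt any statistical clustering.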

[3960] Alternatively, a method termed the “candidate gene approach” can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., a 33410 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.

[3961] Alternatively, a method termed “gene expression profiling” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., a 33410 molecule or 33410 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[3962] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 33410 molecule or 33410 modulator, such as those identified by one of the exemplary screening assays described herein.

[3963] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 33410 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 33410 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells, e.g., human cells, will become sensitive to treatment with an agent that the unmodified target cells were resistant to.

[3964] Monitoring the influence of agents (e.g., drugs) on the expression or activity of a 33410 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 33410 gene expression, protein levels, or up-regulate 33410 activity, can be monitored in clinical trials of subjects exhibiting decreased 33410 gene expression, protein levels, or down-regulated 33410 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 33410 gene expression, protein levels, or down-regulate 33410 activity, can be monitored in clinical trials of subjects exhibiting increased 33410 gene expression, protein levels, or upregulated 33410 activity. In such clinical trials, the expression or activity of a 33410 gene, and preferably, other genes that have been implicated in, for example, a 33410-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[3965] 33410 Informatics

[3966] The sequence of a 33410 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a 33410. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 33410 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequences, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[3967] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[3968] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[3969] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples of annotations to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites, and splice junctions. Non-limiting examples of annotations to amino acid sequences include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
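
By way of illustration only, the two-table layout described above can be sketched as follows. The sketch uses Python's built-in sqlite3 module as a stand-in for a relational database such as Sybase or Oracle. The table names, column names, sequence, and annotation are hypothetical placeholders chosen for the example and are not taken from the sequence listing.

```python
import sqlite3

# In-memory database; all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sequences (
    sequence   TEXT NOT NULL,     -- first field: the sequence itself
    seq_id     TEXT PRIMARY KEY   -- second field: identifier for the sequence
);
CREATE TABLE annotations (
    seq_id     TEXT NOT NULL REFERENCES sequences(seq_id),
    descriptor TEXT NOT NULL,     -- descriptor or annotation text
    start_pos  INTEGER NOT NULL,  -- initial position (1-based, inclusive)
    end_pos    INTEGER NOT NULL   -- ultimate position (1-based, inclusive)
);
""")

# Placeholder data only; this is not a real 33410 sequence.
conn.execute("INSERT INTO sequences VALUES (?, ?)",
             ("ATGGCCTTAGGCTAA", "example_seq"))
conn.execute("INSERT INTO annotations VALUES (?, ?, ?, ?)",
             ("example_seq", "hypothetical start codon", 1, 3))

# Join the two tables to recover the annotated stretch of each sequence.
rows = conn.execute("""
    SELECT a.descriptor,
           substr(s.sequence, a.start_pos, a.end_pos - a.start_pos + 1)
    FROM annotations AS a
    JOIN sequences AS s ON s.seq_id = a.seq_id
""").fetchall()
```

Keeping the annotations in their own table allows any number of annotations to refer to one stored sequence through the shared identifier.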

[3970] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.
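
As a simplified illustration of such a search, the sketch below scans stored sequences for exact occurrences of a target sequence. It is a plain exact-match scan, not a BLAST search, and the function name, identifiers, and sequences are hypothetical.

```python
import re

def find_matches(stored, target):
    """Return (identifier, 1-based start) for each exact occurrence of target."""
    hits = []
    for seq_id, seq in stored.items():
        # A zero-width lookahead also reports overlapping occurrences.
        for match in re.finditer("(?=" + re.escape(target) + ")", seq):
            hits.append((seq_id, match.start() + 1))
    return hits

# Placeholder sequences standing in for the data storage means.
stored = {"example_a": "ATGGCCATGG", "example_b": "TTTTCCCC"}
hits = find_matches(stored, "ATGG")
```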

[3971] Thus, in one aspect, the invention features a method of analyzing 33410, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing a 33410 nucleic acid or amino acid sequence; and comparing the 33410 sequence with a second sequence, e.g., one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 33410. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[3972] The method can include evaluating the sequence identity between a 33410 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.
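
One simple way to evaluate sequence identity between two sequences that have already been aligned can be sketched as follows. The treatment of gaps shown here is one common convention and is an assumption of this sketch; it does not replace the identity calculations described elsewhere herein. The function name and example strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the aligned columns of two pre-aligned sequences.

    Columns in which both sequences carry a gap ('-') are ignored; a gap
    opposite a residue counts as a mismatch (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Four of the six aligned columns are identical in this placeholder pair.
identity = percent_identity("ACGT-A", "ACCTTA")
```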

[3973] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
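
The relationship between target length and chance occurrence can be illustrated with a short calculation. The sketch assumes that every residue is independent and equally likely (four nucleotides or twenty amino acids), which real sequences only approximate; the function names are hypothetical.

```python
def chance_match_probability(length: int, alphabet_size: int) -> float:
    """Probability that one fixed position matches a given target by chance,
    assuming independent, equally likely residues."""
    return float(alphabet_size) ** -length

def expected_chance_matches(length: int, alphabet_size: int,
                            database_positions: int) -> float:
    """Expected number of chance matches across the given number of positions."""
    return database_positions * chance_match_probability(length, alphabet_size)

p_six_nt = chance_match_probability(6, 4)      # 1 / 4096
p_two_aa = chance_match_probability(2, 20)     # 1 / 400
p_thirty_nt = chance_match_probability(30, 4)  # far smaller than p_six_nt
```

Under these assumptions each additional nucleotide lowers the chance-match probability four-fold, which is why longer targets are less likely to occur at random.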

[3974] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[3975] Thus, the invention features a method of making a computer readable record of a 33410 sequence that includes recording the sequence on a computer readable matrix. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[3976] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing a 33410 sequence, or record, in machine-readable form; comparing a second sequence to the 33410 sequence; thereby analyzing a sequence. Comparison can include comparing to sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 33410 sequence includes a sequence being compared. In a preferred embodiment the 33410 or second sequence is stored on a first computer, e.g., at a first site and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 33410 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[3977] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has a 33410-associated disease or disorder or a pre-disposition to a 33410-associated disease or disorder, wherein the method comprises the steps of determining 33410 sequence information associated with the subject and based on the 33410 sequence information, determining whether the subject has a 33410-associated disease or disorder or a pre-disposition to a 33410-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[3978] The invention further provides, in an electronic system and/or in a network, a method for determining whether a subject has a 33410-associated disease or disorder or a pre-disposition to a disease associated with a 33410, wherein the method comprises the steps of determining 33410 sequence information associated with the subject, and based on the 33410 sequence information, determining whether the subject has a 33410-associated disease or disorder or a pre-disposition to a 33410-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, and comparing the 33410 sequence of the subject to the 33410 sequences in the database to thereby determine whether the subject has a 33410-associated disease or disorder, or a pre-disposition for such.

[3979] The present invention also provides, in a network, a method for determining whether a subject has a 33410-associated disease or disorder or a pre-disposition to a 33410-associated disease or disorder, said method comprising the steps of receiving 33410 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 33410 and/or corresponding to a 33410-associated disease or disorder (e.g., a neurological disorder and/or carcinomas), and based on one or more of the phenotypic information, the 33410 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has a 33410-associated disease or disorder or a pre-disposition to a 33410-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[3980] The present invention also provides a method for determining whether a subject has a 33410-associated disease or disorder or a pre-disposition to a 33410-associated disease or disorder, said method comprising the steps of receiving information related to 33410 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 33410 and/or related to a 33410-associated disease or disorder, and based on one or more of the phenotypic information, the 33410 information, and the acquired information, determining whether the subject has a 33410-associated disease or disorder or a pre-disposition to a 33410-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[3981] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

Background of the 33521 Invention

[3982] Rho small GTPases are critical cellular regulators. Rho proteins regulate the formation of actin structures, particularly stress fibers and focal adhesions. For example, the Rho family member Rac regulates membrane ruffling and lamellipodia formation in fibroblasts. Cdc42, another family member, regulates filopodia formation. Rho proteins also have more disparate roles in cell cycle progression, gene transcription through the serum response factor, Ras-mediated transformation, and NADPH oxidase-mediated phagocytosis (see Fleming et al. (1999) J Biol Chem 274:12753 for a review). Rho proteins, like other small GTPases, cycle between two different conformations: one bound to GTP and one to GDP. The different conformations have different regulatory properties (see, e.g., Park et al. (1997) Proc. Natl. Acad. Sci. USA 94:4463-8). The GTP-bound protein can hydrolyze GTP to convert to the GDP-bound form. This event is stimulated by a GTPase-activating protein (GAP). Likewise, the GDP-bound form can convert to the GTP-bound form by releasing the bound GDP and exchanging it for GTP. This step is stimulated by a GDP exchange factor (GEF). Thus, GAPs and GEFs are key regulators in controlling Rho-mediated signalling events.

[3983] One particular class of Rho GEFs is the TIAM class, which has the hallmark domain organization of, from amino- to carboxy-terminus, a first pleckstrin homology (PH) domain, a Raf-like Ras binding domain (RBD), a PDZ domain, a Rho GEF domain, and a second PH domain. The PH, PDZ, and RBD domains are common small modules of signalling proteins. PH domains, as described below, can potentially interact with lipids, membranes, and inositol signalling molecules. PDZ domains are peptide-binding domains that frequently recognize the carboxy-termini of proteins. RBD domains are involved in signalling by Ras, a key mediator of cell proliferation and oncogenesis.

[3984] The TIAM class of GEFs includes TIAM1, TIAM2, Still-life protein type 1 (SIF type 1), and Still-life protein type 2 (SIF type 2). TIAM1 was originally identified as a T-lymphoma invasion and metastasis inducing protein. TIAM1 can induce the invasion of T-lymphoma cells into a fibroblast monolayer by activating Rac1, a Rho class GTPase. Activation of Rac1 activity can modulate the assembly of adherens junctions, and cell adhesion, e.g., E-cadherin-mediated adhesion (Hordijk et al. (1997) Science 278:1464-1466; Habets et al. (1994) Cell 77:537; Michiels et al. (1995) Nature 375:338). TIAM1 can be stimulated by a variety of signals. For example, hyaluronic acid binding to CD44 results in a CD44 interaction with TIAM1 that increases Rac1 activity (Bourguignon et al. (2000) J Biol Chem 275:1829). Thus, extracellular environment and signals can regulate cell motility and morphogenesis through the TIAM1 protein. The Drosophila TIAM1 homolog, SIF, is similarly involved in controlling cell morphogenesis, particularly the morphology of synaptic terminals (Sone et al. (1997) Science 275:543).

[3985] In sum, the TIAM1 class of GEFs are key intracellular signalling molecules which have multiple protein domains, including a GEF domain that regulates Rho GTPases. These proteins are likely to be critical in a variety of physiological systems, including, but not limited to, immune cells, epithelial cells, and neurons, as well as pathological systems such as tumor cells and metastatic cells.

Summary of the 33521 Invention

[3986] The present invention is based, in part, on the discovery of a novel Rho GDP exchange factor (Rho GEF) family member, referred to herein as “33521”. The nucleotide sequence of a cDNA encoding 33521 is shown in SEQ ID NO:61, and the amino acid sequence of a 33521 polypeptide is shown in SEQ ID NO:62. In addition, the nucleotide sequence of the coding region is depicted in SEQ ID NO:63.

[3987] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes a 33521 protein or polypeptide, e.g., a biologically active portion of the 33521 protein. In a preferred embodiment the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO:62. In other embodiments, the invention provides isolated 33521 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO:61, SEQ ID NO:63, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO:61, SEQ ID NO:63, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:61, SEQ ID NO:63, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 33521 protein or an active fragment thereof.

[3988] In a related aspect, the invention further provides nucleic acid constructs that include a 33521 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included are vectors and host cells containing the 33521 nucleic acid molecules of the invention, e.g., vectors and host cells suitable for producing 33521 nucleic acid molecules and polypeptides.

[3989] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 33521-encoding nucleic acids.

[3990] In still another related aspect, isolated nucleic acid molecules that are antisense to a 33521 encoding nucleic acid molecule are provided.

[3991] In another aspect, the invention features 33521 polypeptides, and biologically active or antigenic fragments thereof, that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 33521-mediated or -related disorders. In another embodiment, the invention provides 33521 polypeptides having a 33521 activity. Preferred polypeptides are 33521 proteins including at least one Rho GEF domain; at least one, preferably two PH domains; at least one PDZ domain; and, preferably, having a 33521 activity, e.g., a 33521 activity as described herein.

[3992] In other embodiments, the invention provides 33521 polypeptides, e.g., a 33521 polypeptide having the amino acid sequence shown in SEQ ID NO:62 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO:62 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:61, SEQ ID NO:63, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 33521 protein or an active fragment thereof.

[3993] In a related aspect, the invention further provides nucleic acid constructs which include a 33521 nucleic acid molecule described herein.

[3994] In a related aspect, the invention provides 33521 polypeptides or fragments operatively linked to non-33521 polypeptides to form fusion proteins.

[3995] In another aspect, the invention features antibodies, and antigen-binding fragments thereof, that react with, or more preferably specifically bind, 33521 polypeptides or fragments thereof.

[3996] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 33521 polypeptides or nucleic acids.

[3997] In still another aspect, the invention provides a process for modulating 33521 polypeptide or nucleic acid expression or activity, e.g. using the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 33521 polypeptides or nucleic acids, such as conditions involving aberrant or deficient cellular proliferation or differentiation.

[3998] The invention also provides assays for determining the activity of or the presence or absence of 33521 polypeptides or nucleic acid molecules in a biological sample, including for disease diagnosis.

[3999] In yet another aspect, the invention provides methods for inhibiting the proliferation, or inducing the killing, of a 33521-expressing cell, e.g., a hyper-proliferative 33521-expressing cell. The method includes contacting the cell with a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 33521 polypeptide or nucleic acid. In a preferred embodiment, the contacting step is effected in vitro or ex vivo. In other embodiments, the contacting step is effected in vivo, e.g., in a subject (e.g., a mammal, e.g., a human), as part of a therapeutic or prophylactic protocol. In a preferred embodiment, the cell is a hyperproliferative cell, e.g., a cell found in a solid tumor, a soft tissue tumor, or a metastatic lesion.

[4000] In a preferred embodiment, the compound is an inhibitor of a 33521 polypeptide. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion). In another preferred embodiment, the compound is an inhibitor of a 33521 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[4001] In a preferred embodiment, the compound is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[4002] In another aspect, the invention features methods for treating or preventing a disorder characterized by aberrant cellular proliferation or differentiation of a 33521-expressing cell, in a subject. Preferably, the method includes administering to the subject (e.g., a mammal, e.g., a human) an effective amount of a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 33521 polypeptide or nucleic acid. In a preferred embodiment, the disorder is a cancerous or pre-cancerous condition.

[4003] In a further aspect, the invention provides methods for evaluating the efficacy of a treatment of a disorder, e.g., proliferative disorder or a disorder. The method includes: treating a subject, e.g., a patient or an animal, with a protocol under evaluation (e.g., treating a subject with one or more of: chemotherapy, radiation, and/or a compound identified using the methods described herein); and evaluating the expression of a 33521 nucleic acid or polypeptide before and after treatment. A change, e.g., a decrease or increase, in the level of a 33521 nucleic acid (e.g., mRNA) or polypeptide after treatment, relative to the level of expression before treatment, is indicative of the efficacy of the treatment of the disorder. The level of 33521 nucleic acid or polypeptide expression can be detected by any method described herein.

[4004] In a preferred embodiment, the evaluating step includes obtaining a sample (e.g., a tissue sample, e.g., a biopsy, or a fluid sample) from the subject, before and after treatment, and comparing the level of expression of a 33521 nucleic acid (e.g., mRNA) or polypeptide before and after treatment.

[4005] In another aspect, the invention provides methods for evaluating the efficacy of a therapeutic or prophylactic agent (e.g., an anti-neoplastic agent). The method includes: contacting a sample with an agent (e.g., a compound identified using the methods described herein, a cytotoxic agent) and, evaluating the expression of 33521 nucleic acid or polypeptide in the sample before and after the contacting step. A change, e.g., a decrease or increase, in the level of 33521 nucleic acid (e.g., mRNA) or polypeptide in the sample obtained after the contacting step, relative to the level of expression in the sample before the contacting step, is indicative of the efficacy of the agent. The level of 33521 nucleic acid or polypeptide expression can be detected by any method described herein. In a preferred embodiment, the sample includes cells obtained from a cancerous tissue.

[4006] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 33521 polypeptide or nucleic acid molecule, including for disease diagnosis.

[4007] In another aspect, the invention features a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes a 33521 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to a 33521 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 33521 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.
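
The addressing scheme of such an array can be sketched as a mapping from positionally distinct (row, column) addresses to unique capture probes. The class name, probe names, and grid dimensions below are hypothetical and serve only to illustrate the data structure.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CaptureProbe:
    name: str    # hypothetical probe label
    kind: str    # "nucleic acid" or "polypeptide"
    target: str  # molecule the probe is intended to recognize

def build_array(probes, n_cols):
    """Assign each probe a distinct (row, column) address, filling row by row."""
    if len(set(probes)) != len(probes):
        raise ValueError("each address must hold a unique capture probe")
    return {divmod(i, n_cols): probe for i, probe in enumerate(probes)}

probes = [
    CaptureProbe("probe_1", "nucleic acid", "33521"),
    CaptureProbe("probe_2", "polypeptide", "33521"),
    CaptureProbe("control_1", "nucleic acid", "control"),
]
array = build_array(probes, n_cols=2)

# Addresses whose capture probe recognizes a 33521 molecule.
addresses_for_33521 = [addr for addr, p in array.items() if p.target == "33521"]
```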

[4008] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

Detailed Description of 33521

[4009] The human 33521 sequence (see SEQ ID NO:61, as recited in Example 43), which is approximately 5437 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 5106 nucleotides, including the termination codon. The coding sequence encodes a 1701 amino acid protein (see SEQ ID NO:62, as recited in Example 43).
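
The stated lengths are mutually consistent, as the following arithmetic check illustrates. The variable names are hypothetical, and because the nucleotide lengths are given as approximate, the untranslated total is likewise approximate.

```python
protein_length_aa = 1701  # amino acids encoded (SEQ ID NO:62)
cds_length_nt = 5106      # coding sequence, including the termination codon
cdna_length_nt = 5437     # full cDNA, including untranslated regions

# Remove the 3-nucleotide termination codon, then count sense codons.
sense_codons = (cds_length_nt - 3) // 3

# Nucleotides outside the coding sequence (5' and 3' untranslated regions combined).
untranslated_nt = cdna_length_nt - cds_length_nt
```

Here 5106 nucleotides less the three-nucleotide termination codon gives 5103 nucleotides, or 1701 codons, matching the 1701 amino acid protein; about 331 nucleotides remain as untranslated sequence.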

[4010] Human 33521 contains the following regions or other structural features:

[4011] two PH (pleckstrin homology) domains (PFAM Accession Number PF00169) located at about amino acid residues 507 to 620 and 1353 to 1455 of SEQ ID NO:62;

[4012] an RBD domain (PFAM Accession Number PF02196) located at about amino acid residues 810 to 881 of SEQ ID NO:62;

[4013] a PDZ domain (PFAM Accession Number PF00595) located at about amino acid residues 890 to 975 of SEQ ID NO:62;

[4014] a Rho guanine nucleotide exchange factor domain (PFAM Accession Number PF00621) located at about amino acid residues 1103 to 1292 of SEQ ID NO:62;

[4015] a predicted coiled-coil domain located at about amino acids 629 to 694 of SEQ ID NO:62;

[4016] 30 predicted protein kinase C phosphorylation sites (PS00005) at about amino acids 29 to 31, 82 to 84, 93 to 95, 129 to 131, 158 to 160, 225 to 227, 237 to 239, 298 to 300, 330 to 332, 370 to 372, 375 to 377, 391 to 393, 440 to 442, 541 to 543, 632 to 634, 755 to 757, 767 to 769, 962 to 964, 1046 to 1048, 1054 to 1056, 1110 to 1112, 1220 to 1222, 1314 to 1316, 1471 to 1473, 1486 to 1488, 1491 to 1493, 1494 to 1496, 1510 to 1512, 1578 to 1580, and 1641 to 1643, of SEQ ID NO:62;

[4017] 45 predicted casein kinase II phosphorylation sites (PS00006) located at about amino acids 129 to 132, 218 to 221, 229 to 232, 248 to 251, 261 to 264, 298 to 301, 337 to 340, 411 to 414, 471 to 474, 478 to 481, 485 to 488, 491 to 494, 577 to 580, 593 to 596, 604 to 607, 741 to 744, 786 to 789, 797 to 800, 898 to 901, 950 to 953, 984 to 987, 1019 to 1022, 1036 to 1039, 1046 to 1049, 1067 to 1070, 1095 to 1098, 1136 to 1139, 1146 to 1149, 1160 to 1163, 1239 to 1242, 1264 to 1267, 1302 to 1305, 1312 to 1315, 1322 to 1325, 1384 to 1387, 1414 to 1417, 1421 to 1424, 1438 to 1441, 1471 to 1474, 1502 to 1505, 1514 to 1517, 1569 to 1572, 1642 to 1645, 1661 to 1664, and 1672 to 1675, of SEQ ID NO:62;

[4018] twelve predicted N-glycosylation sites (PS00001) located at about amino acids 15 to 18, 144 to 147, 291 to 294, 638 to 641, 1003 to 1006, 1069 to 1072, 1131 to 1134, 1382 to 1385, 1412 to 1415, 1500 to 1503, 1640 to 1643, and 1665 to 1668, of SEQ ID NO:62;

[4019] six predicted cAMP/cGMP-dependent protein kinase phosphorylation sites (PS00004) located at about amino acids 215 to 218, 372 to 375, 381 to 384, 748 to 751, 781 to 784, and 1016 to 1019, of SEQ ID NO:62;

[4020] twenty predicted N-myristylation sites (PS00008) from about amino acids 2 to 7, 49 to 54, 79 to 84, 88 to 93, 105 to 110, 180 to 185, 240 to 245, 315 to 320, 324 to 329, 422 to 427, 763 to 768, 821 to 826, 928 to 933, 934 to 939, 1032 to 1037, 1164 to 1169, 1405 to 1410, 1521 to 1526, 1531 to 1536, and 1618 to 1623, of SEQ ID NO:62;

[4021] one predicted glycosaminoglycan site (PS00002) located at about amino acids 1615 to 1618 of SEQ ID NO:62; and

[4022] three predicted tyrosine kinase phosphorylation sites (PS00007) located at about amino acids 401 to 408, 843 to 850, and 856 to 864, of SEQ ID NO:62.
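
Predicted sites of the kind listed above are typically found by scanning a sequence for short PROSITE patterns. The sketch below illustrates this with regular expressions. The patterns are written as they are commonly documented for these PROSITE entries and should be verified against the PROSITE release in use. The peptide `TOY` and the helper `scan` are hypothetical and are not taken from SEQ ID NO:62.

```python
import re

# PROSITE patterns rewritten as Python regular expressions (assumed, verify before use).
PROSITE_REGEX = {
    "PS00001": r"N[^P][ST][^P]",                  # N-glycosylation: N-{P}-[ST]-{P}
    "PS00002": r"SG.G",                           # glycosaminoglycan: S-G-x-G
    "PS00004": r"[RK]{2}.[ST]",                   # cAMP/cGMP-dependent kinase: [RK](2)-x-[ST]
    "PS00005": r"[ST].[RK]",                      # protein kinase C: [ST]-x-[RK]
    "PS00006": r"[ST].{2}[DE]",                   # casein kinase II: [ST]-x(2)-[DE]
    "PS00007": r"[RK].{2,3}[DE].{2,3}Y",          # tyrosine kinase
    "PS00008": r"G[^EDRKHPFYW].{2}[STAGCN][^P]",  # N-myristylation
}


def scan(sequence: str, regex: str) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) spans, one per matching start position."""
    # A lookahead lets overlapping sites be reported instead of skipped.
    lookahead = re.compile(f"(?=({regex}))")
    return [(m.start(1) + 1, m.end(1)) for m in lookahead.finditer(sequence)]


TOY = "MNGTAKRRASVEESGDGK"  # hypothetical peptide used only for illustration

for accession, regex in PROSITE_REGEX.items():
    print(accession, scan(TOY, regex))
```

Because such patterns are short, they match frequently by chance, which is why sites identified this way are described as predicted.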

[4023] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[4024] A plasmid containing the nucleotide sequence encoding human 33521 (clone “Fbh33521FL”) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

[4025] The 33521 protein has the domain organization of the TIAM1 family of proteins. The organization is as follows: a first PH domain, a RBD domain, a PDZ domain, a Rho GEF domain, and a second PH domain. The 33521 protein contains a significant number of structural characteristics in common with members of the Rho GEF family, the PH domain family, and the PDZ domain family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics.
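
The domain organization described above can be derived from the domain coordinates recited for SEQ ID NO:62 by ordering the domains from the amino terminus to the carboxy terminus. The following sketch shows one way to do so; the helper names are hypothetical.

```python
# Domain coordinates of SEQ ID NO:62 as recited above (1-based, inclusive),
# deliberately listed out of order.
DOMAINS = [
    ("Rho GEF", 1103, 1292),
    ("PH", 507, 620),
    ("PDZ", 890, 975),
    ("PH", 1353, 1455),
    ("RBD", 810, 881),
]


def architecture(domains):
    """Return domain names ordered from amino terminus to carboxy terminus."""
    return [name for name, _start, _end in sorted(domains, key=lambda d: d[1])]


def overlapping(domains):
    """Return pairs of adjacent domains whose coordinate ranges overlap."""
    ordered = sorted(domains, key=lambda d: d[1])
    return [(a[0], b[0]) for a, b in zip(ordered, ordered[1:]) if b[1] <= a[2]]


print(" - ".join(architecture(DOMAINS)))
print(overlapping(DOMAINS))
```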

[4026] Rho GEF domains are also referred to as Dbl homology domains. The Rho GEF domain typically functions in conjunction with a carboxy-terminal PH domain. The crystal structure of the Rho GEF from the human Son of sevenless (Sos) protein and its adjacent carboxy-terminal PH domain has been determined (Soisson et al. (1998) Cell 95:259-68). The Rho GEF domain is entirely α-helical with the active site residues positioned near the interface with the PH domain. These two domains may function together in binding the effector region of Rho proteins to thereby trigger GDP release.

[4027] A 33521 polypeptide can include a “Rho GEF domain” or regions homologous with a “Rho GEF domain”.

[4028] As used herein, the term “Rho GEF domain” includes an amino acid sequence of about 170 to 190 amino acid residues in length and having a bit score for the alignment of the sequence to the Rho GEF domain (HMM) of at least 145. Preferably, a Rho GEF domain includes at least about 150 to 220 amino acids, more preferably about 160 to 200 amino acid residues, or about 170 to 190 amino acids and has a bit score for the alignment of the sequence to the Rho GEF domain (HMM) of at least 50, 100, preferably 180, or more preferably 200 or greater. The Rho GEF domain (HMM) has been assigned the PFAM Accession Number PF00621 (http://genome.wustl.edu/Pfam/.html). An alignment of the Rho GEF domain (amino acids 1103 to 1292 of SEQ ID NO:62) of human 33521 with a PFAM consensus amino acid sequence (SEQ ID NO:67) derived from a hidden Markov model is depicted in FIG. 31D. An alignment of the Rho GEF domain (amino acids 1103 to 1292 of SEQ ID NO:62) of human 33521 with a SMART consensus amino acid sequence (SEQ ID NO:72) derived from a hidden Markov model is depicted in FIG. 32D.

[4029] In a preferred embodiment, a 33521 polypeptide or protein has a “Rho GEF domain” or a region which includes at least about 150 to 220, more preferably about 160 to 200 or 170 to 190, amino acid residues and has at least about 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “Rho GEF domain,” e.g., the Rho GEF domain of human 33521 (e.g., residues 1103 to 1292 of SEQ ID NO:62).

[4030] To identify the presence of a “Rho GEF” domain in a 33521 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against the Pfam database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference. A search was performed against the HMM database resulting in the identification of a “Rho GEF” domain in the amino acid sequence of human 33521 at about residues 1103 to 1292 of SEQ ID NO:62 (see Example 43).
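
The length and bit score criteria described above can be applied to the output of an HMM search as a simple filter. The sketch below assumes that search results have already been parsed into records. The `DomainHit` class, the helper functions, and the bit score of 150.0 are hypothetical, and the approximate ("about") length limits are treated as hard bounds for simplicity.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    model: str        # name of the HMM, e.g., a Pfam family
    start: int        # first aligned residue (1-based)
    end: int          # last aligned residue (inclusive)
    bit_score: float  # alignment bit score reported by the search

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def is_hit(bit_score: float, threshold: float = 15.0) -> bool:
    """Apply a search threshold; 15 is the default stated above and 8 a lowered value."""
    return bit_score >= threshold


def meets_definition(hit: DomainHit, min_len: int, max_len: int, min_score: float) -> bool:
    """Check a hit against a length range and a minimum bit score."""
    return min_len <= hit.length <= max_len and hit.bit_score >= min_score


# Coordinates from the paragraph above; the bit score is a made-up illustration.
rho_gef = DomainHit("Rho GEF", 1103, 1292, 150.0)
print(rho_gef.length, meets_definition(rho_gef, 170, 190, 145))
print(is_hit(12.0), is_hit(12.0, threshold=8.0))
```

Lowering the threshold admits weaker matches, so hits accepted only at the lower value generally merit closer inspection of the alignment.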

[4031] PH domains are small domains found in a diverse class of proteins, including those involved in intracellular signaling and the cytoskeleton. The domain has been implicated in binding the βγ subunit of heterotrimeric G proteins, lipids (e.g., ascorbyl stearates, and phosphoinositols), and phosphoserine and phosphothreonine. The ligand specificity of PH domains can vary from domain to domain. The structures of multiple PH domains have been determined (for a review, see Riddihough (1994) Nat. Struct. Biol. 1:755-757). From analysis of such structures, it is evident that the PH domains represent a conserved fold consisting of two perpendicular β-sheets followed by an amphipathic α-helix despite the lack of absolutely conserved residues. However, the loop regions differ greatly in length and composition. Many Rho GEF domain proteins also contain a PH domain. Such proteins include vav, dbl, Sos, yeast CDC24, TIAM1, and Still-life (SIF) proteins.

[4032] A 33521 polypeptide can include a “PH domain” or regions homologous with a “PH domain”.

[4033] As used herein, the term “PH domain” includes an amino acid sequence of about 80 to 120 amino acid residues in length and having a bit score for the alignment of the sequence to the PH domain (HMM) of at least 60. Preferably, a PH domain includes at least about 80 to 130 amino acids, more preferably about 90 to 120 amino acid residues, or about 100 to 115 amino acids and has a bit score for the alignment of the sequence to the PH domain (HMM) of at least 5, 10, 20, 40, 60 or greater. The PH domain (HMM) has been assigned the PFAM Accession Number PF00169 (http://genome.wustl.edu/Pfam/.html). Alignments of the PH domains (amino acids 507 to 620 and 1353 to 1455 of SEQ ID NO:62) of human 33521 with a consensus amino acid sequence (SEQ ID NO:64) derived from a PFAM hidden Markov model are depicted in FIGS. 31A and 31E. Similar alignments with the PH domain consensus from SMART are depicted in FIGS. 32A and 32E.

[4034] In a preferred embodiment, a 33521 polypeptide or protein has a “PH domain” or a region which includes at least about 80 to 130, more preferably about 90 to 120 or 100 to 115, amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “PH domain,” e.g., the first PH domain of human 33521 (e.g., residues 507 to 620 of SEQ ID NO:62) or the second PH domain of human 33521 (e.g., residues 1353 to 1455 of SEQ ID NO:62).

[4035] To identify the presence of a “PH” domain in a 33521 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against a database of HMMs, e.g., as described above. A “PH” domain was located by such a search in the amino acid sequence of human 33521 at about residues 507 to 620 and 1353 to 1455 of SEQ ID NO:62 (see Example 43).

[4036] PDZ domains (also referred to as DHR or GLGF domains) are found in proteins from bacteria to humans. The canonical PDZ domain is a largely β-sheet structure, which has a ligand binding pocket for extended peptides. In eukaryotes, PDZ domains frequently bind to carboxy-terminal peptides, such as the carboxy-terminal tetrapeptide X-[ST]-X-V. The peptide can bind in a groove and form a β-strand which is antiparallel to a β-strand in the domain, thus completing an anti-parallel β-sheet. The peptide binding groove can also contain a conserved loop which provides a binding pocket for the carboxy-terminus of the ligand. The carboxylate at the terminus hydrogen bonds to main chain amides of the loop, which is characterized by a conserved glycine and a conserved phenylalanine. PDZ domain-containing proteins can mediate protein-protein interactions in signaling complexes, as well as a host of other functions, such as the clustering of ion channels that terminate in PDZ-binding peptides.
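
The carboxy-terminal tetrapeptide X-[ST]-X-V mentioned above can be expressed as a pattern anchored at the end of a sequence. The following minimal sketch tests whether a candidate ligand ends in such a tetrapeptide; the function name and the example peptides are hypothetical.

```python
import re

# X-[ST]-X-V anchored at the carboxy terminus; "." stands for any residue (X).
PDZ_LIGAND_TAIL = re.compile(r".[ST].V$")


def has_pdz_ligand_tail(sequence: str) -> bool:
    """Return True if the last four residues match X-[ST]-X-V."""
    return bool(PDZ_LIGAND_TAIL.search(sequence))


# Hypothetical peptides for illustration only.
for peptide in ("AAAAETDV", "AAAAETDL", "TDV"):
    print(peptide, has_pdz_ligand_tail(peptide))
```

A match indicates only that a sequence has a tail of the stated form; actual binding depends on the particular PDZ domain and its specificity.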

[4037] A 33521 polypeptide can include a “PDZ domain” or regions homologous with a “PDZ domain”.

[4038] As used herein, the term “PDZ domain” includes an amino acid sequence of about 80 to 90 amino acid residues in length and having a bit score for the alignment of the sequence to the PDZ domain (HMM) of at least 30. Preferably, a PDZ domain includes at least about 70 to 120 amino acids, more preferably about 80 to 110 amino acid residues, or about 80 to 90 amino acids and has a bit score for the alignment of the sequence to the PDZ domain (HMM) of at least 5, 10, 20, 25, 30, 32 or greater. The PDZ domain (HMM) has been assigned the PFAM Accession Number PF00595 (http://genome.wustl.edu/Pfam/.html). Preferably, a PDZ domain further includes a conserved glycine-phenylalanine dipeptide in the carboxylate ligand binding loop. An alignment of the PDZ domain (amino acids 890 to 975 of SEQ ID NO:62) of human 33521 with a consensus amino acid sequence (SEQ ID NO:66) derived from a PFAM hidden Markov model is depicted in FIG. 31C. An alignment of the PDZ domain (amino acids 900 to 976 of SEQ ID NO:62) of human 33521 with a consensus amino acid sequence (SEQ ID NO:71) derived from a SMART hidden Markov model is depicted in FIG. 32C.

[4039] In a preferred embodiment, a 33521 polypeptide or protein has a “PDZ domain” or a region which includes at least about 70 to 120, more preferably about 80 to 110 or 80 to 90, amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “PDZ domain,” e.g., the PDZ domain of human 33521 (e.g., residues 890 to 975 of SEQ ID NO:62). The 33521 polypeptide has the conserved “GF” dipeptide motif at about residues 903 to 904 of SEQ ID NO:62. To identify the presence of a “PDZ” domain in a 33521 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against a database of Pfam HMMs, e.g., as described above. A “PDZ domain” was located by such a search in the amino acid sequence of human 33521 at about residues 890 to 975 of SEQ ID NO:62 (see Example 43).

[4040] An RBD domain, or “Ras-binding” domain, can be found on a variety of cell signalling molecules. The paradigm for this domain is the RBD of the Raf serine-threonine kinase, which is a key component of the Ras signalling pathway. Ras is a critical small GTPase which regulates cell proliferation. The central anti-parallel β-sheet of the RBD interacts with two β-strands from the Ras effector region by means of side chain and main chain interactions (Nassar et al. (1995) Nature 375:554-6).

[4041] A 33521 polypeptide can include a “RBD domain” or regions homologous with a “RBD domain”.

[4042] As used herein, the term “RBD domain” includes an amino acid sequence of about 65 to 80 amino acid residues in length and having a bit score for the alignment of the sequence to the RBD domain consensus from the SMART database of HMM of at least 65. Preferably, a RBD domain includes at least about 40 to 100 amino acids, more preferably about 50 to 90 amino acid residues, or about 65 to 80 amino acids and has a bit score for the alignment of the sequence to the RBD domain consensus derived from the SMART database (HMM) of at least 10, 20, 30, 40, 50, 60 or greater. An alignment of the RBD domain (amino acids 810 to 873 of SEQ ID NO:62) of human 33521 with a consensus amino acid sequence (SEQ ID NO:65) derived from a PFAM hidden Markov model is depicted in FIG. 31B. An alignment of the RBD domain (amino acids 810 to 881 of SEQ ID NO:62) of human 33521 with a consensus amino acid sequence (SEQ ID NO:70) derived from a SMART hidden Markov model is depicted in FIG. 32B.

[4043] In a preferred embodiment, a 33521 polypeptide or protein has a “RBD domain” or a region which includes at least about 40 to 100, more preferably about 50 to 90 or 65 to 80, amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “RBD domain,” e.g., the RBD domain of human 33521 (e.g., residues 810 to 881 of SEQ ID NO:62).

[4044] To identify the presence of a “RBD” domain in a 33521 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against a SMART database (Simple Modular Architecture Research Tool, http://smart.embl-heidelberg.de/) of HMMs as described in Schultz et al. (1998), Proc. Natl. Acad. Sci. USA 95:5857 and Schultz et al. (2000) Nucl. Acids Res 28:231. The database contains domains identified by profiling with the hidden Markov models of the HMMer2 search program (R. Durbin et al. (1998) Biological sequence analysis: probabilistic models of proteins and nucleic acids. Cambridge University Press.; http://hmmer.wustl.edu/). The database also is extensively annotated and monitored by experts to enhance accuracy. A “RBD domain” was located by such a search in the amino acid sequence of human 33521 at about residues 810 to 880 of SEQ ID NO:62 (see Example 43).

[4045] A 33521 molecule can include: at least one and preferably two PH domains; an RBD domain; a PDZ domain; and a Rho GEF domain.

[4046] A 33521 molecule can further include at least one coiled-coil domain; at least one, two, four, six, eight, ten, twenty, twenty-five, or preferably thirty protein kinase C phosphorylation sites; at least one, two, four, six, eight, ten, 20, 25, 30, 35, 40, or preferably 45 casein kinase II phosphorylation sites; at least one, two, three, four, five, six, seven, eight, nine, ten, eleven, or preferably twelve N-glycosylation sites; at least one, two, three, four, five, or preferably six cAMP/cGMP-dependent protein kinase phosphorylation sites; at least one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, fifteen, eighteen, or preferably twenty predicted N-myristylation sites; at least one predicted glycosaminoglycan site; and at least one, two, or preferably three tyrosine kinase phosphorylation sites.

[4047] As the 33521 polypeptides of the invention may modulate 33521-mediated activities, they may be useful for developing novel diagnostic and therapeutic agents for 33521-mediated or related disorders, as described below.

[4048] As used herein, a “33521 activity”, “biological activity of 33521” or “functional activity of 33521”, refers to an activity exerted by a 33521 protein, polypeptide or nucleic acid molecule. For example, a 33521 activity can be an activity exerted by 33521 in a physiological milieu on, e.g., a 33521-responsive cell or on a 33521 substrate, e.g., a protein substrate. A 33521 activity can be determined in vivo or in vitro. In one embodiment, a 33521 activity is a direct activity, such as an association with a 33521 target molecule. A “target molecule” or “binding partner” is a molecule with which a 33521 protein binds or interacts in nature, e.g., a Rac guanine nucleotide exchange factor (GEF).

[4049] A 33521 activity can also be an indirect activity, e.g., a cellular signaling activity mediated by interaction of the 33521 protein with a 33521 receptor. The features of the 33521 molecules of the present invention can provide similar biological activities as Rho GEF family members. For example, the 33521 proteins of the present invention can have one or more of the following activities: (1) modulation of cell morphology, e.g., regulation of focal adhesions, or stress fibers; (2) modulation of cell motility, e.g., regulation of lamellipodia, and leading edges; (3) modulation of cell adhesion, e.g., cadherin-mediated adhesion, or adherens junctions; (4) activation of small GTPases by stimulating guanine nucleotide exchange; (5) signaling in response to small lipophilic second messengers, e.g., phosphoinositols, or ascorbyl stearates; (6) signaling in response to phosphorylation events, e.g., phosphorylation of 33521 predicted tyrosine phosphorylation sites or predicted protein kinase C phosphorylation sites; (7) signaling in response to cell surface receptors, e.g., CD44; and (8) signaling in response to activation of Ras or Gβγ proteins.

[4050] Thus, the 33521 molecules can act as novel diagnostic targets and therapeutic agents for controlling cell proliferation and differentiation disorders, including metastatic disorders.

[4051] Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast and liver origin.

[4052] As used herein, the terms “cancer”, “hyperproliferative” and “neoplastic” refer to cells having the capacity for autonomous growth. Examples of such cells include cells having an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative and neoplastic disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. “Pathologic hyperproliferative” cells occur in disease states characterized by malignant tumor growth. Examples of non-pathologic hyperproliferative cells include proliferation of cells associated with wound repair.

[4053] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[4054] The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

[4055] The term “sarcoma” is art recognized and refers to malignant tumors of mesenchymal derivation.

[4056] Additional examples of proliferative disorders include hematopoietic neoplastic disorders. As used herein, the term “hematopoietic neoplastic disorders” includes diseases involving hyperplastic/neoplastic cells of hematopoietic origin. A hematopoietic neoplastic disorder can arise from myeloid, lymphoid or erythroid lineages, or precursor cells thereof. Preferably, the diseases arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit Rev. in Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to, acute lymphoblastic leukemia (ALL) which includes B-lineage ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell leukemia (HLL) and Waldenstrom's macroglobulinemia (WM). Additional forms of malignant lymphomas include, but are not limited to, non-Hodgkin lymphoma and variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGF), Hodgkin's disease and Reed-Sternberg disease.

[4057] The 33521 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO:62 thereof are collectively referred to as “polypeptides or proteins of the invention” or “33521 polypeptides or proteins”. Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “33521 nucleic acids.” 33521 molecules refer to 33521 nucleic acids, polypeptides, and antibodies.

[4058] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA), RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA. A DNA or RNA analog can be synthesized from nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[4059] The term “isolated nucleic acid molecule” or “purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules present in the natural source of the nucleic acid. For example, with regards to genomic DNA, the term “isolated” includes nucleic acid molecules which are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[4060] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and nonaqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified.

[4061] Preferably, an isolated nucleic acid molecule of the invention that hybridizes under a stringency condition described herein to the sequence of SEQ ID NO:61 or SEQ ID NO:63, corresponds to a naturally-occurring nucleic acid molecule.

[4062] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature. For example a naturally occurring nucleic acid molecule can encode a natural protein. As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules which include at least an open reading frame encoding a 33521 protein. The gene can optionally further include non-coding sequences, e.g., regulatory sequences and introns. Preferably, a gene encodes a mammalian 33521 protein or derivative thereof.

[4063] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. “Substantially free” means that a preparation of 33521 protein is at least 10% pure. In a preferred embodiment, the preparation of 33521 protein has less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-33521 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-33521 chemicals. When the 33521 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[4064] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 33521 without abolishing or substantially altering a 33521 activity. Preferably the alteration does not substantially alter the 33521 activity, e.g., the activity is at least 20%, 40%, 60%, 70% or 80% of wild-type. An “essential” amino acid residue is a residue that, when altered from the wild-type sequence of 33521, results in abolishing a 33521 activity such that less than 20% of the wild-type activity is present. For example, conserved amino acid residues in 33521 are predicted to be particularly unamenable to alteration.

[4065] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a 33521 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a 33521 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 33521 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO:61 or SEQ ID NO:63, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
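
The side chain families listed above lend themselves to a simple lookup. The sketch below encodes the families exactly as recited, using one-letter amino acid codes, and treats a substitution as conservative when both residues fall within at least one common family. The names `SIDE_CHAIN_FAMILIES` and `is_conservative` are hypothetical.

```python
# Side chain families as recited above, in one-letter amino acid codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if two different residues share at least one side chain family."""
    if original == replacement:
        return False  # an unchanged residue is not a substitution
    return any(
        original in family and replacement in family
        for family in SIDE_CHAIN_FAMILIES.values()
    )


for pair in (("K", "R"), ("D", "E"), ("T", "V"), ("K", "D")):
    print(pair, is_conservative(*pair))
```

Because some residues belong to more than one family (e.g., threonine is both uncharged polar and beta-branched), the check accepts a match in any shared family.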

[4066] As used herein, a “biologically active portion” of a 33521 protein includes a fragment of a 33521 protein which participates in an interaction, e.g., an intramolecular or an inter-molecular interaction. An inter-molecular interaction can be a specific binding interaction or an enzymatic interaction (e.g., the interaction can be transient and a covalent bond is formed or broken). An inter-molecular interaction can be between a 33521 molecule and a non-33521 molecule or between a first 33521 molecule and a second 33521 molecule (e.g., a dimerization interaction). Biologically active portions of a 33521 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 33521 protein, e.g., the amino acid sequence shown in SEQ ID NO:62, which include fewer amino acids than the full length 33521 proteins, and exhibit at least one activity of a 33521 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 33521 protein, e.g., guanine nucleotide exchange. A biologically active portion of a 33521 protein can be a polypeptide which is, for example, 10, 25, 50, 100, 200 or more amino acids in length. Biologically active portions of a 33521 protein can be used as targets for developing agents which modulate a 33521 mediated activity, e.g., guanine nucleotide exchange.

[4067] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[4068] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, 60%, and even more preferably at least 70%, 80%, 90%, 100% of the length of the reference sequence. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”).

[4069] The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
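
The calculation described above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (gaps written as "-") and that the denominator is the total number of alignment columns, so that each gap position lowers the result. That choice of denominator is an assumption made for illustration; alignment programs differ on this point, and the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption (not taken from the specification): the denominator is the
    number of alignment columns, so every gap column counts against identity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column is identical when both residues match and neither is a gap.
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


if __name__ == "__main__":
    # Six columns, four identical positions, one mismatch, one gap.
    print(round(percent_identity("ACGT-A", "ACCTGA"), 1))
```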

[4070] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch ((1970) J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used unless otherwise specified) is a BLOSUM 62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
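
For orientation, the global alignment approach of Needleman and Wunsch can be sketched as a short dynamic program. The sketch below is a simplification and is not the GAP program: it assumes a flat match/mismatch score in place of a BLOSUM 62 or PAM250 matrix and a single linear gap penalty in place of separate gap and length weights. The function name and default scores are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1,
                     gap: int = -2):
    """Global alignment of a and b; returns (aligned_a, aligned_b, score).

    Simplifying assumptions: flat match/mismatch scores and a linear gap
    penalty, rather than a substitution matrix with affine gap weights.
    """
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align residues
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


if __name__ == "__main__":
    top, bottom, total = needleman_wunsch("ACGT", "AGT")
    print(top)
    print(bottom)
    print(total)
```

The aligned strings returned here can be passed to any percent identity calculation that accepts gapped sequences of equal length.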

[4071] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Meyers and W. Miller ((1989) CABIOS, 4:11-17) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[4072] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 33521 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 33521 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25:3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[4073] Particularly preferred 33521 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO:62. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:62 are termed substantially identical.

[4074] In the context of a nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:61 or 63 are termed substantially identical.

[4075] “Misexpression or aberrant expression”, as used herein, refers to a non-wildtype pattern of gene expression at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over- or under-expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of altered, e.g., increased or decreased, expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, translated amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[4076] “Subject,” as used herein, refers to human and non-human animals. The term “non-human animals” of the invention includes all vertebrates, e.g., mammals, such as non-human primates (particularly higher primates), sheep, dog, rodent (e.g., mouse or rat), guinea pig, goat, pig, cat, rabbits, cow, and non-mammals, such as chickens, amphibians, reptiles, etc. In a preferred embodiment, the subject is a human. In another embodiment, the subject is an experimental animal or animal suitable as a disease model.

[4077] A “purified preparation of cells”, as used herein, refers to an in vitro preparation of cells. In the case of cells from multicellular organisms (e.g., plants and animals), a purified preparation of cells is a subset of cells obtained from the organism, not the entire intact organism. In the case of unicellular microorganisms (e.g., cultured cells and microbial cells), it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[4078] Various aspects of the invention are described in further detail below.

[4079] Isolated Nucleic Acid Molecules of 33521

[4080] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes a 33521 polypeptide described herein, e.g., a full-length 33521 protein or a fragment thereof, e.g., a biologically active portion of 33521 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 33521 mRNA, and fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[4081] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO:61, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 33521 protein (i.e., “the coding region” of SEQ ID NO:61, as shown in SEQ ID NO:63), as well as 5′ untranslated sequences. Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:61 (e.g., SEQ ID NO:63) and, e.g., no flanking sequences which normally accompany the subject sequence. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to a fragment of the protein from about amino acid 1103 to 1292 of SEQ ID NO:62. In still other embodiments, the nucleic acid molecule encodes a sequence corresponding to one of the following fragments of the 33521 protein: from about amino acid 1 to 507, 507 to 620, 620 to 810, 810 to 853, 853 to 890, 890 to 975, 975 to 1103, 1292 to 1353, and 1353 to 1455, of SEQ ID NO:62.

[4082] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule which is a complement of the nucleotide sequence shown in SEQ ID NO:61 or SEQ ID NO:63, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO:61 or SEQ ID NO:63, such that it can hybridize (e.g., under a stringency condition described herein) to the nucleotide sequence shown in SEQ ID NO:61 or 63, thereby forming a stable duplex.

[4083] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence which is at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO:61 or SEQ ID NO:63, or a portion, preferably of the same length, of any of these nucleotide sequences.

[4084] 33521 Nucleic Acid Fragments

[4085] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO:61 or 63. For example, such a nucleic acid molecule can include a fragment which can be used as a probe or primer or a fragment encoding a portion of a 33521 protein, e.g., an immunogenic or biologically active portion of a 33521 protein. A fragment can comprise those nucleotides of SEQ ID NO:61, which encode a Rho GEF domain of human 33521. Other fragments include those encoding a PH domain, a PDZ domain, or a RBD domain of human 33521. The nucleotide sequence determined from the cloning of the 33521 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 33521 family members, or fragments thereof, as well as 33521 homologues, or fragments thereof, from other species.

[4086] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment which includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 50 amino acids in length, more preferably 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1650, 1670, 1680, 1685, 1690, 1695, 1700, or 1701 amino acids in length.

[4087] Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention.

[4088] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domains, regions, or functional sites described herein. Thus, for example, a 33521 nucleic acid fragment can include a sequence corresponding to a Rho GEF domain, a PH domain, a PDZ domain, or an RBD domain.

[4089] 33521 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under a stringency condition described herein to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO:61 or SEQ ID NO:63, or of a naturally occurring allelic variant or mutant of SEQ ID NO:61 or SEQ ID NO:63. Preferably, an oligonucleotide is less than about 200, 150, 120, or 100 nucleotides in length.

[4090] In one embodiment, the probe or primer is attached to a solid support, e.g., a solid support described herein.

[4091] One exemplary kit of primers includes a forward primer that anneals to the coding strand and a reverse primer that anneals to the non-coding strand. The forward primer can anneal to the start codon, e.g., the nucleic acid sequence encoding amino acid residue 1 of SEQ ID NO:62. The reverse primer can anneal to the ultimate codon, e.g., the codon immediately before the stop codon, e.g., the codon encoding amino acid residue 1701 of SEQ ID NO:62. In a preferred embodiment, the annealing temperatures of the forward and reverse primers differ by no more than 5, 4, 3, or 2° C.
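
The temperature criterion above can be checked numerically once a melting temperature estimate is chosen. The sketch below uses the Wallace rule (2° C. per A or T, 4° C. per G or C), a common rule of thumb for short oligonucleotides, as a stand-in for the annealing temperature. The specification does not prescribe any particular estimate, so the rule, the function names, and the example primers are assumptions made for illustration only.

```python
def wallace_tm(primer: str) -> int:
    """Estimate melting temperature in degrees C with the Wallace rule.

    Assumption: Tm = 2*(A+T) + 4*(G+C), a rough estimate for short primers
    that stands in here for the annealing temperature.
    """
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    if at + gc != len(p):
        raise ValueError("primer must contain only A, C, G and T")
    return 2 * at + 4 * gc


def primers_compatible(forward: str, reverse: str, max_delta: int = 5) -> bool:
    """True when the two estimated temperatures differ by at most max_delta."""
    return abs(wallace_tm(forward) - wallace_tm(reverse)) <= max_delta


if __name__ == "__main__":
    # Hypothetical primers; these are not taken from SEQ ID NO:61 or 63.
    fwd = "ATGCATGCATGCATGCATGC"
    rev = "ATATGCGCATATGCGCATGC"
    print(wallace_tm(fwd), wallace_tm(rev), primers_compatible(fwd, rev))
```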

[4092] In a preferred embodiment the nucleic acid is a probe which is at least 10, 12, 15, 18, 20 and less than 200, more preferably less than 100, or less than 50, nucleotides in length. It should be identical, or differ by 1, or 2, or less than 5 or 10 nucleotides, from a sequence disclosed herein. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[4093] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid which encodes: a Rho GEF domain, a PH domain, a PDZ domain, or a RBD domain of human 33521.

[4094] In another embodiment a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 33521 sequence, e.g., a domain, region, site or other sequence described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical, or differ by one base from a sequence disclosed herein or from a naturally occurring variant. For example, primers suitable for amplifying all or a portion of any of the following regions are provided: a Rho GEF domain from about amino acid 1103 to 1292 of SEQ ID NO:62; a PH domain from about amino acid 507 to 622 and 1353 to 1457 of SEQ ID NO:62; an RBD domain from about amino acid 810 to 881 of SEQ ID NO:62; or a PDZ domain from about amino acid 890 to 975 of SEQ ID NO:62.
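
When designing primers against a domain defined by amino acid positions, the residue range must first be converted to nucleotide positions. The sketch below shows that arithmetic under one assumption: the coding sequence is numbered from 1 beginning at the first nucleotide of the start codon, so residue k occupies nucleotides 3(k-1)+1 through 3k. Because the specification gives the domain boundaries as approximate ("about"), the resulting coordinates are likewise approximate. The function name is hypothetical.

```python
from typing import Tuple


def codon_span(first_residue: int, last_residue: int) -> Tuple[int, int]:
    """Map a 1-based amino acid range to 1-based coding-sequence positions.

    Assumption: position 1 of the coding sequence is the first nucleotide of
    the start codon, so residue k spans nucleotides 3*(k-1)+1 through 3*k.
    """
    if first_residue < 1 or last_residue < first_residue:
        raise ValueError("residue range must be 1-based and non-empty")
    return 3 * (first_residue - 1) + 1, 3 * last_residue


if __name__ == "__main__":
    # Approximate domain boundaries as recited for SEQ ID NO:62.
    domains = {
        "Rho GEF": (1103, 1292),
        "RBD": (810, 881),
        "PDZ": (890, 975),
    }
    for name, (start, end) in domains.items():
        print(name, codon_span(start, end))
```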

[4095] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[4096] A nucleic acid fragment encoding a “biologically active portion of a 33521 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:61 or 63, which encodes a polypeptide having a 33521 biological activity (e.g., the biological activities of the 33521 proteins are described herein), expressing the encoded portion of the 33521 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 33521 protein. For example, a nucleic acid fragment encoding a biologically active portion of 33521 includes a Rho GEF domain, e.g., amino acid residues about 1103 to 1292 of SEQ ID NO:62. A nucleic acid fragment encoding a biologically active portion of a 33521 polypeptide may comprise a nucleotide sequence which is 300 or more nucleotides in length.

[4097] In preferred embodiments, a nucleic acid includes a nucleotide sequence which is about 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000 or more nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO:61 or SEQ ID NO:63.

[4098] In preferred embodiments, the fragment includes at least one, and preferably at least 5, 10, 15, 25, 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1500, 2000, 2500, 3000, 3500, 4000, or 4500 nucleotides from nucleotides 1-2142, 1-2273, 1-3415, 1-4822, 1-4937, 1-4949, 1-4958, or 1-4973 of SEQ ID NO:61.

[4099] In preferred embodiments, the fragment includes the nucleotide sequence of SEQ ID NO:63 and at least one, and preferably at least 5, 10, 15, 25, 50, 75, 100, 150, 200, or 240 consecutive nucleotides of SEQ ID NO:61.

[4100] In preferred embodiments, the fragment includes at least one, and preferably at least 5, 10, 15, 25, 50, 75, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000, or more nucleotides encoding a protein including at least 5, 10, 15, 20, 25, 30, 40, 50, 100, 150, 200, 250, 300, 350, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, or 1700 consecutive amino acids of SEQ ID NO:62. In one embodiment, the encoded protein includes at least 5, 10, 15, 20, 25, 30, 40, 50, 100, 150, 200, 250, 300, 400, 500, or 600 consecutive amino acids from residues 1-631 of SEQ ID NO:62.

[4101] In preferred embodiments, the nucleic acid fragment includes a nucleotide sequence that is other than, e.g., differs by at least one, two, three or more nucleotides from, a sequence in WO 00/40607 or in GenBank™ Accession numbers AI094945, AI126294, AW338968, AW026228, AI982584, AI589050, BE504999, AL122086, AF120323, AF120324, AF195656.

[4102] In preferred embodiments, the fragment comprises the coding region of 33521, e.g., the nucleotide sequence of SEQ ID NO:66.

[4103] 33521 Nucleic Acid Variants

[4104] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:61 or SEQ ID NO:63. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid which encodes the same 33521 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence which differs by at least 1, but less than 5, 10, 20, 50, or 100 amino acid residues from that shown in SEQ ID NO:62. If alignment is needed for this comparison the sequences should be aligned for maximum homology. The encoded protein can differ by no more than 5, 4, 3, 2, or 1 amino acid. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[4105] Nucleic acids of the invention can be chosen for having codons, which are preferred, or non-preferred, for a particular expression system. E.g., the nucleic acid can be one in which at least one codon, and preferably at least 10%, or 20% of the codons has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells.
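
The proportion of altered codons referred to above can be measured by comparing a recoded sequence with the original coding sequence codon by codon. The following is a minimal sketch: it assumes both sequences are in frame and of equal length, and it only counts changed codons without checking that each change is synonymous. The function name and example sequences are hypothetical.

```python
def fraction_codons_altered(original_cds: str, recoded_cds: str) -> float:
    """Fraction of codons that differ between two in-frame coding sequences.

    Assumptions: both sequences start in frame, have the same length, and
    that length is a non-zero multiple of three. Synonymy is not verified.
    """
    a, b = original_cds.upper(), recoded_cds.upper()
    if len(a) != len(b) or len(a) == 0 or len(a) % 3 != 0:
        raise ValueError("sequences must be equal, non-empty, in-frame lengths")
    codons_a = [a[i:i + 3] for i in range(0, len(a), 3)]
    codons_b = [b[i:i + 3] for i in range(0, len(b), 3)]
    changed = sum(1 for x, y in zip(codons_a, codons_b) if x != y)
    return changed / len(codons_a)


if __name__ == "__main__":
    # Hypothetical example: ATG is kept, CTG -> TTA (both leucine) and
    # AAA -> AAG (both lysine), so two of three codons are altered.
    print(round(fraction_codons_altered("ATGCTGAAA", "ATGTTAAAG"), 2))
```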

[4106] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism) or can be non-naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared in the encoded product).

[4107] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO:61 or 63, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 1%, 5%, 10% or 20% of the nucleotides in the subject nucleic acid. The nucleic acid can differ by no more than 5, 4, 3, 2, or 1 nucleotide. If necessary for this analysis the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[4108] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the nucleotide sequence shown in SEQ ID NO:62 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under a stringency condition described herein, to the nucleotide sequence shown in SEQ ID NO:62 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 33521 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 33521 gene.

[4109] Preferred variants include those that are correlated with guanine nucleotide exchange activity.

[4110] Allelic variants of 33521, e.g., human 33521, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 33521 protein within a population that maintain the ability to bind a Rho family protein, e.g., Rac. Functional allelic variants will typically contain only conservative substitution of one or more amino acids of SEQ ID NO:62, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally-occurring amino acid sequence variants of the 33521, e.g., human 33521, protein within a population that do not have the ability to bind a Rho family protein, e.g., Rac. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO:62, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[4111] Moreover, nucleic acid molecules encoding other 33521 family members and, thus, which have a nucleotide sequence which differs from the 33521 sequences of SEQ ID NO:61 or SEQ ID NO:63 are intended to be within the scope of the invention.

[4112] Antisense Nucleic Acid Molecules, Ribozymes and Modified 33521 Nucleic Acid Molecules

[4113] In another aspect, the invention features an isolated nucleic acid molecule which is antisense to 33521. An “antisense” nucleic acid can include a nucleotide sequence which is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 33521 coding strand, or to only a portion thereof (e.g., the coding region of human 33521 corresponding to SEQ ID NO:63). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 33521 (e.g., the 5′ and 3′ untranslated regions).

[4114] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 33521 mRNA, but more preferably is an oligonucleotide which is antisense to only a portion of the coding or noncoding region of 33521 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 33521 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence of interest. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
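
As an illustration of targeting the region surrounding the translation start site, the sketch below derives a DNA oligonucleotide complementary to positions −10 through +10 of a sense-strand sequence. It assumes the usual numbering in which +1 is the A of the start codon and there is no position 0, so the window covers 20 nucleotides. The function names and the example sequence are hypothetical and are not taken from SEQ ID NO:61.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def antisense_to_start_region(sense: str, start_index: int,
                              flank: int = 10) -> str:
    """Antisense oligonucleotide covering -flank..+flank around the start codon.

    start_index is the 0-based index of the A of the ATG in the sense strand.
    Assumption: numbering has no position 0, so the window holds 2*flank bases.
    """
    lo, hi = start_index - flank, start_index + flank
    if lo < 0 or hi > len(sense):
        raise ValueError("window extends beyond the supplied sequence")
    return reverse_complement(sense[lo:hi])


if __name__ == "__main__":
    # Hypothetical sense strand: 10 nt of 5' untranslated sequence, then ATG.
    sense = "GGGGGCCCCC" + "ATGAAACCCG"
    print(antisense_to_start_region(sense, start_index=10))
```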

[4115] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[4116] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a 33521 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies which bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

[4117] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[4118] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for a 33521-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of a 33521 cDNA disclosed herein (i.e., SEQ ID NO:61 or SEQ ID NO:63), and a sequence having known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haselhoff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a 33521-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 33521 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science 261:1411-1418.

[4119] 33521 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 33521 gene (e.g., the 33521 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 33521 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6:569-84; Helene, C. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) BioEssays 14:807-15. The potential sequences that can be targeted for triple helix formation can be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.

[4120] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[4121] A 33521 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For non-limiting examples of synthetic oligonucleotides with modifications see Toulmé (2001) Nature Biotech. 19:17 and Faria et al. (2001) Nature Biotech. 19:40-44. Such phosphoramidite oligonucleotides can be effective antisense agents.

[4122] For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4: 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra and Perry-O'Keefe et al. Proc. Natl. Acad. Sci. 93: 14670-675.

[4123] PNAs of 33521 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 33521 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene, (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes, (e.g., S1 nucleases (Hyrup B. et al. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[4124] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[4125] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region which is complementary to a 33521 nucleic acid of the invention, two complementary regions one having a fluorophore and one a quencher such that the molecular beacon is useful for quantitating the presence of the 33521 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.

[4126] Isolated 33521 Polypeptides

[4127] In another aspect, the invention features an isolated 33521 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-33521 antibodies. 33521 protein can be isolated from cells or tissue sources using standard protein purification techniques. 33521 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[4128] Polypeptides of the invention include those which arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when expressed in a native cell.

[4129] In a preferred embodiment, a 33521 polypeptide has one or more of the following characteristics:

[4130] (i) it has the ability to stimulate guanine nucleotide exchange of a Rho protein, e.g., Rac;

[4131] (ii) it has a molecular weight, e.g., a deduced molecular weight, preferably ignoring any contribution of post-translational modifications, amino acid composition or other physical characteristic of SEQ ID NO:62;

[4132] (iii) it has an overall sequence similarity of at least 50%, preferably at least 60%, more preferably at least 70, 80, 90, or 95%, with a polypeptide of SEQ ID NO:62;

[4133] (iv) it has a Rho GEF domain which is preferably about 70%, 80%, 90% or 95% identical with amino acid residues about 1103 to 1292 of SEQ ID NO:62;

[4134] (v) it has two PH domains which are preferably about 70%, 80%, 90% or 95% identical with amino acid residues about 507 to 620, and 1353 to 1455, of SEQ ID NO:62;

[4135] (vi) it has an RBD domain which is preferably about 70%, 80%, 90% or 95% identical with amino acid residues about 810 to 881 of SEQ ID NO:62; or

[4136] (vii) it has a PDZ domain which is preferably about 70%, 80%, 90% or 95% identical with amino acid residues about 890 to 975 of SEQ ID NO:62.

[4137] In a preferred embodiment the 33521 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:62. In one embodiment it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another embodiment it differs from the corresponding sequence in SEQ ID NO:62 by at least one residue, but fewer than 20%, 15%, 10% or 5% of its residues differ from the corresponding sequence in SEQ ID NO:62. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non-essential residue or a conservative substitution. In a preferred embodiment the differences are not in the Rho GEF domain. In another preferred embodiment one or more differences are in the Rho GEF domain.

[4138] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 33521 proteins differ in amino acid sequence from SEQ ID NO:62, yet retain biological activity.
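
The comparison rule described above (mismatches and looped-out residues each count as a difference) can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned for maximum homology by some external tool and that gaps are written as "-". The function names, the gap symbol, the short example peptides, and the choice of the variant's residue count as the denominator are illustrative assumptions, not part of the disclosure.

```python
def count_differences(aligned_ref: str, aligned_var: str) -> int:
    """Count alignment columns in which the two aligned sequences differ.

    A mismatch counts as one difference, and so does each gap column
    (a residue "looped out" by an insertion or deletion).
    """
    if len(aligned_ref) != len(aligned_var):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for a, b in zip(aligned_ref, aligned_var) if a != b)


def percent_different(aligned_ref: str, aligned_var: str, gap: str = "-") -> float:
    """Differences as a percentage of the residues in the variant.

    Assumption: the denominator is the number of residues in the variant
    (gap characters excluded).
    """
    n_residues = sum(1 for b in aligned_var if b != gap)
    if n_residues == 0:
        raise ValueError("variant contains no residues")
    return 100.0 * count_differences(aligned_ref, aligned_var) / n_residues


# Hypothetical 10-column alignment: one substitution (I to L), one deletion (Q).
reference = "MKTAYIAKQR"
variant = "MKTAYLAK-R"
n_diff = count_differences(reference, variant)
pct_diff = percent_different(reference, variant)
```

In this toy alignment the variant differs at two columns out of nine residues, so it would satisfy a "fewer than 5 residues" criterion but not a "fewer than 20%" criterion.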

[4139] In one embodiment, the protein includes an amino acid sequence at least about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more homologous to SEQ ID NO:62.

[4140] A 33521 protein or fragment is provided which varies from the sequence of SEQ ID NO:62 in regions defined by amino acids about 1 to 1102, and 1293 to 1701 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment but which does not differ from SEQ ID NO:62 in regions defined by amino acids about 1103 to 1292. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.

[4141] In one embodiment, a biologically active portion of a 33521 protein includes a Rho GEF domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 33521 protein.

[4142] In a preferred embodiment, the 33521 protein has an amino acid sequence shown in SEQ ID NO:62. In other embodiments, the 33521 protein is substantially identical to SEQ ID NO:62. In yet another embodiment, the 33521 protein is substantially identical to SEQ ID NO:62 and retains the functional activity of the protein of SEQ ID NO:62, as described in detail in the subsections above.

[4143] 33521 Chimeric or Fusion Proteins

[4144] In another aspect, the invention provides 33521 chimeric or fusion proteins. As used herein, a 33521 “chimeric protein” or “fusion protein” includes a 33521 polypeptide linked to a non-33521 polypeptide. A “non-33521 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the 33521 protein, e.g., a protein which is different from the 33521 protein and which is derived from the same or a different organism. The 33521 polypeptide of the fusion protein can correspond to all or a portion, e.g., a fragment described herein, of a 33521 amino acid sequence. In a preferred embodiment, a 33521 fusion protein includes at least one (or two) biologically active portions of a 33521 protein. The non-33521 polypeptide can be fused to the N-terminus or C-terminus of the 33521 polypeptide.

[4145] The fusion protein can include a moiety which has a high affinity for a ligand. For example, the fusion protein can be a GST-33521 fusion protein in which the 33521 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 33521. Alternatively, the fusion protein can be a 33521 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 33521 can be increased through use of a heterologous signal sequence.

[4146] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[4147] The 33521 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 33521 fusion proteins can be used to affect the bioavailability of a 33521 substrate. 33521 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example, (i) aberrant modification or mutation of a gene encoding a 33521 protein; (ii) mis-regulation of the 33521 gene; and (iii) aberrant post-translational modification of a 33521 protein.

[4148] Moreover, the 33521 fusion proteins of the invention can be used as immunogens to produce anti-33521 antibodies in a subject, to purify 33521 ligands, and in screening assays to identify molecules which inhibit the interaction of 33521 with a 33521 substrate.

[4149] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 33521-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 33521 protein.

[4150] Variants of 33521 Proteins

[4151] In another aspect, the invention also features a variant of a 33521 polypeptide, e.g., which functions as an agonist (mimetics) or as an antagonist. Variants of the 33521 proteins can be generated by mutagenesis, e.g., discrete point mutation, the insertion or deletion of sequences or the truncation of a 33521 protein. An agonist of the 33521 proteins can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of a 33521 protein. An antagonist of a 33521 protein can inhibit one or more of the activities of the naturally occurring form of the 33521 protein by, for example, competitively modulating a 33521-mediated activity of a 33521 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 33521 protein.

[4152] Variants of a 33521 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a 33521 protein for agonist or antagonist activity.

[4153] Libraries of fragments, e.g., N-terminal, C-terminal, or internal fragments, of a 33521 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of a 33521 protein. Variants in which a cysteine residue is added or deleted or in which a residue which is glycosylated is added or deleted are particularly preferred.

[4154] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property are known in the art. Such methods are adaptable for rapid screening of the gene libraries generated by combinatorial mutagenesis of 33521 proteins. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify 33521 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6:327-331).

[4155] Cell-based assays can be exploited to analyze a variegated 33521 library. For example, a library of expression vectors can be transfected into a cell line, e.g., a cell line which ordinarily responds to 33521 in a substrate-dependent manner. The transfected cells are then contacted with 33521 and the effect of the expression of the mutant on signaling by the 33521 substrate can be detected, e.g., by measuring stimulation of guanine nucleotide exchange on a Rho protein. Plasmid DNA can then be recovered from the cells which score for inhibition, or alternatively, potentiation of signaling by the 33521 substrate, and the individual clones further characterized.

[4156] In another aspect, the invention features a method of making a 33521 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 33521 polypeptide. The method includes: altering the sequence of a 33521 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[4157] In another aspect, the invention features a method of making a fragment or analog of a 33521 polypeptide having a biological activity of a naturally occurring 33521 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of a 33521 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[4158] Anti-33521 Antibodies

[4159] In another aspect, the invention provides an anti-33521 antibody, or a fragment thereof (e.g., an antigen-binding fragment thereof). The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. As used herein, the term “antibody” refers to a protein comprising at least one, and preferably two, heavy (H) chain variable regions (abbreviated herein as VH), and at least one and preferably two light (L) chain variable regions (abbreviated herein as VL). The VH and VL regions can be further subdivided into regions of hypervariability, termed “complementarity determining regions” (“CDR”), interspersed with regions that are more conserved, termed “framework regions” (FR). The extent of the framework region and CDR's has been precisely defined (see, Kabat, E. A., et al. (1991) Sequences of Proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, and Chothia, C. et al. (1987) J. Mol. Biol. 196:901-917, which are incorporated herein by reference). Each VH and VL is composed of three CDR's and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.

[4160] The anti-33521 antibody can further include a heavy and light chain constant region, to thereby form a heavy and light immunoglobulin chain, respectively. In one embodiment, the antibody is a tetramer of two heavy immunoglobulin chains and two light immunoglobulin chains, wherein the heavy and light immunoglobulin chains are inter-connected by, e.g., disulfide bonds. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. The light chain constant region is comprised of one domain, CL. The variable region of the heavy and light chains contains a binding domain that interacts with an antigen. The constant regions of the antibodies typically mediate the binding of the antibody to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system.

[4161] As used herein, the term “immunoglobulin” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes. The recognized human immunoglobulin genes include the kappa, lambda, alpha (IgA1 and IgA2), gamma (IgG1, IgG2, IgG3, IgG4), delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Full-length immunoglobulin “light chains” (about 25 kDa or 214 amino acids) are encoded by a variable region gene at the NH2-terminus (about 110 amino acids) and a kappa or lambda constant region gene at the COOH-terminus. Full-length immunoglobulin “heavy chains” (about 50 kDa or 446 amino acids) are similarly encoded by a variable region gene (about 116 amino acids) and one of the other aforementioned constant region genes, e.g., gamma (encoding about 330 amino acids).

[4162] The term “antigen-binding fragment” of an antibody (or simply “antibody portion,” or “fragment”), as used herein, refers to one or more fragments of a full-length antibody that retain the ability to specifically bind to the antigen, e.g., 33521 polypeptide or fragment thereof. Examples of antigen-binding fragments of the anti-33521 antibody include, but are not limited to: (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab′)₂ fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 341:544-546), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) Science 242:423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879-5883). Such single chain antibodies are also encompassed within the term “antigen-binding fragment” of an antibody. These antibody fragments are obtained using conventional techniques known to those with skill in the art, and the fragments are screened for utility in the same manner as are intact antibodies.

[4163] The anti-33521 antibody can be a polyclonal or a monoclonal antibody. In other embodiments, the antibody can be recombinantly produced, e.g., produced by phage display or by combinatorial methods.

[4164] Phage display and combinatorial methods for generating anti-33521 antibodies are known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. International Publication No. WO 92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. International Publication No. WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and Barbas et al. (1991) PNAS 88:7978-7982, the contents of all of which are incorporated by reference herein).

[4165] In one embodiment, the anti-33521 antibody is a fully human antibody (e.g., an antibody made in a mouse which has been genetically engineered to produce an antibody from a human immunoglobulin sequence), or a non-human antibody, e.g., a rodent (mouse or rat), goat, primate (e.g., monkey), or camel antibody. Preferably, the non-human antibody is a rodent (mouse or rat) antibody. Methods of producing rodent antibodies are known in the art.

[4166] Human monoclonal antibodies can be generated using transgenic mice carrying the human immunoglobulin genes rather than the mouse system. Splenocytes from these transgenic mice immunized with the antigen of interest are used to produce hybridomas that secrete human mAbs with specific affinities for epitopes from a human protein (see, e.g., Wood et al. International Application WO 91/00906; Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International Application WO 92/03918; Kay et al. International Application WO 92/03917; Lonberg, N. et al. 1994 Nature 368:856-859; Green, L. L. et al. 1994 Nature Genet. 7:13-21; Morrison, S. L. et al. 1984 Proc. Natl. Acad. Sci. USA 81:6851-6855; Bruggemann et al. 1993 Year Immunol 7:33-40; Tuaillon et al. 1993 PNAS 90:3720-3724; Bruggemann et al. 1991 Eur J Immunol 21:1323-1326).

[4167] An anti-33521 antibody can be one in which the variable region, or a portion thereof, e.g., the CDR's, are generated in a non-human organism, e.g., a rat or mouse. Chimeric, CDR-grafted, and humanized antibodies are within the invention. Antibodies generated in a non-human organism, e.g., a rat or mouse, and then modified, e.g., in the variable framework or constant region, to decrease antigenicity in a human are within the invention.

[4168] Chimeric antibodies can be produced by recombinant DNA techniques known in the art. For example, a gene encoding the Fc constant region of a murine (or other species) monoclonal antibody molecule is digested with restriction enzymes to remove the region encoding the murine Fc, and the equivalent portion of a gene encoding a human Fc constant region is substituted (see Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., International Application WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al., European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al. (1987) PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al. (1985) Nature 314:446-449; and Shaw et al., 1988, J. Natl Cancer Inst. 80:1553-1559).

[4169] A humanized or CDR-grafted antibody will have at least one or two but generally all three recipient CDR's (of heavy and/or light immunoglobulin chains) replaced with a donor CDR. A CDR of the antibody may be replaced with at least a portion of a non-human CDR, or only some of the CDR's may be replaced with non-human CDR's. It is only necessary to replace the number of CDR's required for binding of the humanized antibody to a 33521 or a fragment thereof. Preferably, the donor will be a rodent antibody, e.g., a rat or mouse antibody, and the recipient will be a human framework or a human consensus framework. Typically, the immunoglobulin providing the CDR's is called the “donor” and the immunoglobulin providing the framework is called the “acceptor.” In one embodiment, the donor immunoglobulin is a non-human (e.g., rodent) immunoglobulin. The acceptor framework is a naturally-occurring (e.g., a human) framework or a consensus framework, or a sequence about 85% or higher, preferably 90%, 95%, 99% or higher identical thereto.

[4170] As used herein, the term “consensus sequence” refers to the sequence formed from the most frequently occurring amino acids (or nucleotides) in a family of related sequences (see, e.g., Winnacker, From Genes to Clones (Verlagsgesellschaft, Weinheim, Germany 1987)). In a family of proteins, each position in the consensus sequence is occupied by the amino acid occurring most frequently at that position in the family. If two amino acids occur equally frequently, either can be included in the consensus sequence. A “consensus framework” refers to the framework region in the consensus immunoglobulin sequence.
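
The definition of a consensus sequence given above can be sketched in a few lines of Python. The sketch assumes the family members are already aligned and of equal length. The function name and the four short example sequences are hypothetical. Because the definition allows either residue when two occur equally often, the sketch resolves ties by taking the alphabetically first residue purely so that the output is deterministic.

```python
from collections import Counter


def consensus_sequence(aligned_family: list[str]) -> str:
    """Return the most frequent residue at each position of an aligned family.

    Ties may be resolved either way under the definition; here the
    alphabetically first tied residue is chosen for reproducibility.
    """
    if not aligned_family:
        raise ValueError("at least one sequence is required")
    length = len(aligned_family[0])
    if any(len(seq) != length for seq in aligned_family):
        raise ValueError("all sequences must have the same aligned length")

    consensus = []
    for column in zip(*aligned_family):
        counts = Counter(column)
        top = max(counts.values())
        consensus.append(min(res for res, n in counts.items() if n == top))
    return "".join(consensus)


# Hypothetical aligned framework fragments (not taken from any real antibody).
family = ["QVQLV", "EVQLV", "QVKLL", "EVQLL"]
result = consensus_sequence(family)
```

For this toy family, positions 1 and 5 are ties (Q/E and V/L), so either residue would be an acceptable consensus choice at those positions.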

[4171] An antibody can be humanized by methods known in the art. Humanized antibodies can be generated by replacing sequences of the Fv variable region which are not directly involved in antigen binding with equivalent sequences from human Fv variable regions. General methods for generating humanized antibodies are provided by Morrison, S. L., 1985, Science 229:1202-1207, by Oi et al., 1986, BioTechniques 4:214, and by Queen et al. U.S. Pat. No. 5,585,089, U.S. Pat. No. 5,693,761 and U.S. Pat. No. 5,693,762, the contents of all of which are hereby incorporated by reference. Those methods include isolating, manipulating, and expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable regions from at least one of a heavy or light chain. Sources of such nucleic acid are well known to those skilled in the art and, for example, may be obtained from a hybridoma producing an antibody against a 33521 polypeptide or fragment thereof. The recombinant DNA encoding the humanized antibody, or fragment thereof, can then be cloned into an appropriate expression vector.

[4172] Humanized or CDR-grafted antibodies can be produced by CDR-grafting or CDR substitution, wherein one, two, or all CDR's of an immunoglobulin chain can be replaced. See e.g., U.S. Pat. No. 5,225,539; Jones et al. 1986 Nature 321:522-525; Verhoeyen et al. 1988 Science 239:1534; Beidler et al. 1988 J. Immunol. 141:4053-4060; Winter U.S. Pat. No. 5,225,539, the contents of all of which are hereby expressly incorporated by reference. Winter describes a CDR-grafting method which may be used to prepare the humanized antibodies of the present invention (UK Patent Application GB 2188638A, filed on Mar. 26, 1987; Winter U.S. Pat. No. 5,225,539), the contents of which are expressly incorporated by reference.

[4173] Also within the scope of the invention are humanized antibodies in which specific amino acids have been substituted, deleted or added. Preferred humanized antibodies have amino acid substitutions in the framework region, such as to improve binding to the antigen. For example, a humanized antibody will have framework residues identical to the donor framework residue or to another amino acid other than the recipient framework residue. To generate such antibodies, a selected, small number of acceptor framework residues of the humanized immunoglobulin chain can be replaced by the corresponding donor amino acids. Preferred locations of the substitutions include amino acid residues adjacent to the CDR, or which are capable of interacting with a CDR (see e.g., U.S. Pat. No. 5,585,089). Criteria for selecting amino acids from the donor are described in U.S. Pat. No. 5,585,089, e.g., columns 12-16 of U.S. Pat. No. 5,585,089, the contents of which are hereby incorporated by reference. Other techniques for humanizing antibodies are described in Padlan et al. EP 519596 A1, published on Dec. 23, 1992.

[4174] In preferred embodiments an antibody can be made by immunizing with purified 33521 antigen, or a fragment thereof, e.g., a fragment described herein.

[4175] A full-length 33521 protein or antigenic peptide fragment of 33521 can be used as an immunogen or can be used to identify anti-33521 antibodies made with other immunogens, e.g., cells, membrane preparations, and the like. The antigenic peptide of 33521 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO:62 and encompasses an epitope of 33521. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.

[4176] Fragments of 33521 which include residues from about amino acid 741 to 750, from about 756 to 762, and from about 1363 to 1372 of SEQ ID NO:62 can be used to make antibodies against hydrophilic regions of the 33521 protein, e.g., they can be used as immunogens or used to characterize the specificity of an antibody. Similarly, fragments of 33521 which include residues from about amino acid 722 to 730, from about 883 to 891, and from about 966 to 975 of SEQ ID NO:62 can be used to make an antibody against a hydrophobic region of the 33521 protein; a fragment of 33521 which includes residues 507 to 620 and 1353 to 1455 of SEQ ID NO:62 can be used to make an antibody against the PH domains of the 33521 protein; a fragment of 33521 which includes residues about 810 to 881 of SEQ ID NO:62 can be used to make an antibody against an RBD domain of the 33521 protein; a fragment of 33521 which includes residues about 1103 to 1292 of SEQ ID NO:62 can be used to make an antibody against the Rho GEF region of the 33521 protein; and a fragment of 33521 which includes residues about 890 to 975 of SEQ ID NO:62 can be used to make an antibody against the PDZ region of the 33521 protein.
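
The residue ranges recited above are 1-based and inclusive, which is easy to get wrong when slicing a sequence programmatically. The following Python sketch shows one way to extract such fragments and to check the minimum antigenic peptide length of 8 residues. The region table copies coordinates from the text; everything else is an assumption made for illustration. In particular, PLACEHOLDER_SEQUENCE is an arbitrary 1701-residue string standing in for SEQ ID NO:62, which is not reproduced here, and the function names are hypothetical.

```python
# 1-based, inclusive residue ranges taken from the text above.
REGIONS = {
    "PH domain 1": (507, 620),
    "RBD domain": (810, 881),
    "PDZ domain": (890, 975),
    "Rho GEF domain": (1103, 1292),
    "PH domain 2": (1353, 1455),
}

MIN_ANTIGENIC_LENGTH = 8  # minimum antigenic peptide length stated in the text

# Placeholder only: NOT the real SEQ ID NO:62, just a string of the same length.
PLACEHOLDER_SEQUENCE = ("ACDEFGHIKLMNPQRSTVWY" * 86)[:1701]


def extract_fragment(sequence: str, start: int, end: int) -> str:
    """Return residues start..end, using 1-based inclusive coordinates."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError(f"range {start}-{end} is outside the sequence")
    return sequence[start - 1:end]


def meets_minimum_length(fragment: str) -> bool:
    """True if the fragment is long enough to serve as an antigenic peptide."""
    return len(fragment) >= MIN_ANTIGENIC_LENGTH


fragments = {
    name: extract_fragment(PLACEHOLDER_SEQUENCE, start, end)
    for name, (start, end) in REGIONS.items()
}
```

Note that a range such as 756 to 762 spans only seven residues, so a peptide built around it would need flanking residues to reach the eight-residue minimum; the text's wording ("fragments which include residues") permits this.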

[4177] Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[4178] Antibodies which bind only native 33521 protein, only denatured or otherwise non-native 33521 protein, or which bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be identified by identifying antibodies which bind to native but not denatured 33521 protein.

[4179] Preferred epitopes encompassed by the antigenic peptide are regions of 33521 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 33521 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 33521 protein and are thus likely to constitute surface residues useful for targeting antibody production.
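
As a rough illustration of how hydrophilic stretches might be located computationally, the sketch below averages Kyte-Doolittle hydropathy values over a sliding window and reports the most hydrophilic window. This is a different and simpler technique than the Emini surface probability analysis named above, swapped in only because its scale is compact; it is not presented as the analysis used for 33521. The window size, function names, and toy peptide are assumptions.

```python
# Kyte-Doolittle hydropathy scale (more negative = more hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def windowed_hydropathy(sequence: str, window: int = 7) -> list[float]:
    """Mean hydropathy of every window of the given size along the sequence."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[res] for res in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def most_hydrophilic_window(sequence: str, window: int = 7) -> tuple[int, int, float]:
    """Return (start, end, mean score) of the lowest-scoring window.

    Coordinates are 1-based and inclusive; the first window wins on ties.
    """
    averages = windowed_hydropathy(sequence, window)
    best = min(range(len(averages)), key=averages.__getitem__)
    return best + 1, best + window, averages[best]


# Toy peptide: a lysine-rich stretch flanked by leucines.
start, end, score = most_hydrophilic_window("LLLLKKKKLLLL", window=4)
```

In the toy peptide the four lysines at positions 5 to 8 form the most hydrophilic window, which is the kind of region the paragraph above describes as a likely surface epitope.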

[4180] In a preferred embodiment the antibody can bind to the Rho GEF portion of the 33521 protein. In another embodiment, the antibody binds a PH domain, a PDZ domain, or RBD domain portion of the 33521 protein.

[4181] The anti-33521 antibody can be a single chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher, D. et al. (1999) Ann N Y Acad Sci 880:263-80; and Reiter, Y. (1996) Clin Cancer Res 2:245-52). The single chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 33521 protein.

[4182] In a preferred embodiment the antibody has effector function and/or can fix complement. In other embodiments the antibody does not recruit effector cells or fix complement.

[4183] In a preferred embodiment, the antibody has reduced or no ability to bind an Fc receptor. For example, it is an isotype or subtype, fragment or other mutant, which does not support binding to an Fc receptor, e.g., it has a mutagenized or deleted Fc receptor binding region.

[4184] In a preferred embodiment, an anti-33521 antibody alters (e.g., increases or decreases) the guanine nucleotide exchange activity of a 33521 polypeptide.

[4185] The antibody can be coupled to a toxin, e.g., a polypeptide toxin, e.g., ricin or diphtheria toxin or an active fragment thereof, or to a radioactive nucleus or an imaging agent, e.g., a radioactive, enzymatic, or other imaging agent, e.g., an NMR contrast agent. Labels which produce detectable radioactive emissions or fluorescence are preferred.

[4186] An anti-33521 antibody (e.g., monoclonal antibody) can be used to isolate 33521 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-33521 antibody can be used to detect 33521 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-33521 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labelling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive materials include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

[4187] The invention also includes a nucleic acid which encodes an anti-33521 antibody, e.g., an anti-33521 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g. CHO or lymphatic cells.

[4188] The invention also includes cell lines, e.g., hybridomas, which make an anti-33521 antibody, e.g., an antibody described herein, and methods of using said cells to make an anti-33521 antibody.

[4189] 33521 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[4190] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[4191] A vector can include a 33521 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 33521 proteins, mutant forms of 33521 proteins, fusion proteins, and the like).

[4192] The recombinant expression vectors of the invention can be designed for expression of 33521 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[4193] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[4194] Purified fusion proteins can be used in 33521 activity assays (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 33521 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells which are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six weeks).

[4195] One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
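
The codon-replacement strategy described above can be sketched in a few lines of Python. This is a minimal illustration only: the codon table below is an assumed, simplified choice of one frequently used E. coli codon per amino acid and is not taken from Wada et al.; the function name `back_translate` and the example peptide are hypothetical.

```python
# Assumed, illustrative table: one frequently used E. coli codon per residue.
# Substitute values from a measured codon-usage table before any real use.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TAA",  # stop codon
}


def back_translate(protein: str) -> str:
    """Return a DNA sequence encoding `protein` using the table above."""
    try:
        return "".join(PREFERRED_CODON[aa] for aa in protein.upper())
    except KeyError as err:
        raise ValueError(f"unsupported residue: {err.args[0]!r}") from None


if __name__ == "__main__":
    # Hypothetical three-residue peptide followed by a stop codon.
    print(back_translate("MKL*"))
```

A single-codon-per-residue table is the simplest possible design; a practical redesign would typically also weigh factors such as repeats and restriction sites, which this sketch does not attempt.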

[4196] The 33521 expression vector can be a yeast expression vector, a vector for expression in insect cells, e.g., a baculovirus expression vector or a vector suitable for expression in mammalian cells.

[4197] When used in mammalian cells, the expression vector's control functions can be provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[4198] In another embodiment, the promoter is an inducible promoter, e.g., a promoter regulated by a steroid hormone, by a polypeptide hormone (e.g., by means of a signal transduction pathway), or by a heterologous polypeptide (e.g., the tetracycline-inducible systems, “Tet-On” and “Tet-Off”; see, e.g., Clontech Inc., CA, Gossen and Bujard (1992) Proc. Natl. Acad. Sci. USA 89:5547, and Paillard (1998) Human Gene Therapy 9:983).

[4199] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3:537-546).

[4200] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus.

[4201] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., a 33521 nucleic acid molecule within a recombinant expression vector or a 33521 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[4202] A host cell can be any prokaryotic or eukaryotic cell. For example, a 33521 protein can be expressed in bacterial cells (such as E. coli), insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells (African green monkey kidney cells CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182)). Other suitable host cells are known to those skilled in the art.

[4203] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[4204] A host cell of the invention can be used to produce (i.e., express) a 33521 protein. Accordingly, the invention further provides methods for producing a 33521 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a 33521 protein has been introduced) in a suitable medium such that a 33521 protein is produced. In another embodiment, the method further includes isolating a 33521 protein from the medium or the host cell.

[4205] In another aspect, the invention features a cell or purified preparation of cells which include a 33521 transgene, or which otherwise misexpress 33521. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include a 33521 transgene, e.g., a heterologous form of a 33521, e.g., a gene derived from humans (in the case of a non-human cell). The 33521 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene that misexpresses an endogenous 33521, e.g., a gene the expression of which is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders that are related to mutated or misexpressed 33521 alleles or for use in drug screening.

[4206] In another aspect, the invention features a human cell, e.g., a hematopoietic stem cell, transformed with nucleic acid which encodes a subject 33521 polypeptide.

[4207] Also provided are cells, preferably human cells, e.g., human hematopoietic or fibroblast cells, in which an endogenous 33521 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 33521 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 33521 gene. For example, an endogenous 33521 gene which is “transcriptionally silent,” e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques such as targeted homologous recombination can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published on May 16, 1991.

[4208] In a preferred embodiment, recombinant cells described herein can be used for replacement therapy in a subject. For example, a nucleic acid encoding a 33521 polypeptide operably linked to an inducible promoter (e.g., a steroid hormone receptor-regulated promoter) is introduced into a human or nonhuman, e.g., mammalian, e.g., porcine recombinant cell. The cell is cultivated and encapsulated in a biocompatible material, such as poly-lysine alginate, and subsequently implanted into the subject. See, e.g., Lanza (1996) Nat. Biotechnol. 14:1107; Joki et al. (2001) Nat. Biotechnol. 19:35; and U.S. Pat. No. 5,876,742. Production of 33521 polypeptide can be regulated in the subject by administering an agent (e.g., a steroid hormone) to the subject. In another preferred embodiment, the implanted recombinant cells express and secrete an antibody specific for a 33521 polypeptide. The antibody can be any antibody or any antibody derivative described herein.

[4209] 33521 Transgenic Animals

[4210] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of a 33521 protein and for identifying and/or evaluating modulators of 33521 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 33521 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[4211] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of a 33521 protein to particular cells. A transgenic founder animal can be identified based upon the presence of a 33521 transgene in its genome and/or expression of 33521 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a 33521 protein can further be bred to other transgenic animals carrying other transgenes.

[4212] 33521 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments the nucleic acid is placed under the control of a tissue specific promoter, e.g., a milk or egg specific promoter, and the protein or polypeptide is recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[4213] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[4214] Uses of 33521

[4215] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: a) screening assays; b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); and c) methods of treatment (e.g., therapeutic and prophylactic).

[4216] The isolated nucleic acid molecules of the invention can be used, for example, to express a 33521 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect a 33521 mRNA (e.g., in a biological sample) or a genetic alteration in a 33521 gene, and to modulate 33521 activity, as described further below. The 33521 proteins can be used to treat disorders characterized by insufficient or excessive production of a 33521 substrate or production of 33521 inhibitors. In addition, the 33521 proteins can be used to screen for naturally occurring 33521 substrates, to screen for drugs or compounds which modulate 33521 activity, as well as to treat disorders characterized by insufficient or excessive production of 33521 protein or production of 33521 protein forms which have decreased, aberrant or unwanted activity compared to 33521 wild type protein (e.g., aberrant cell proliferation and/or differentiation). Moreover, the anti-33521 antibodies of the invention can be used to detect and isolate 33521 proteins, regulate the bioavailability of 33521 proteins, and modulate 33521 activity.

[4217] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 33521 polypeptide is provided. The method includes: contacting the compound with the subject 33521 polypeptide; and evaluating the ability of the compound to interact with, e.g., to bind or form a complex with, the subject 33521 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with the subject 33521 polypeptide. It can also be used to find natural or synthetic inhibitors of the subject 33521 polypeptide. Screening methods are discussed in more detail below.

[4218] 33521 Screening Assays

[4219] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to 33521 proteins, have a stimulatory or inhibitory effect on, for example, 33521 expression or 33521 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 33521 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 33521 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[4220] In one embodiment, the invention provides assays for screening candidate or test compounds which are substrates of a 33521 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate an activity of a 33521 protein or polypeptide or a biologically active portion thereof.

[4221] In one embodiment, an activity of a 33521 protein can be assayed, e.g., assayed for nucleotide exchange activity toward Rho proteins. An example of such an “exchange assay” can be found, for example, in Fleming et al., J. Biol. Chem. 274: 12753-12758 (1999), wherein the nucleotide exchange activity of TIAM1 toward Rac1 was assayed. Briefly, the activity of TIAM1 was observed by incubating [³H]GDP-Rac1 with TIAM1 for a specified time period, and then measuring the amount of [³H]GDP remaining bound to Rac1 following the incubation period.
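
The readout of such an exchange assay, the amount of labeled GDP remaining bound after incubation, can be reduced to a simple fraction. The following Python sketch assumes that radioactivity is recorded as counts per minute (cpm) and that a background value is subtracted; the function names and the numerical values are hypothetical and are not taken from Fleming et al.

```python
def fraction_gdp_bound(cpm_after: float, cpm_initial: float,
                       cpm_background: float = 0.0) -> float:
    """Fraction of labeled GDP still bound after the incubation period."""
    initial = cpm_initial - cpm_background
    if initial <= 0:
        raise ValueError("initial signal must exceed background")
    return (cpm_after - cpm_background) / initial


def fraction_gdp_released(cpm_after: float, cpm_initial: float,
                          cpm_background: float = 0.0) -> float:
    """Fraction of labeled GDP released; higher values suggest more exchange."""
    return 1.0 - fraction_gdp_bound(cpm_after, cpm_initial, cpm_background)


if __name__ == "__main__":
    # Hypothetical counts: 10,000 cpm at time zero, 5,100 cpm after
    # incubation with the exchange factor, 200 cpm background.
    print(fraction_gdp_bound(5100, 10000, 200))
    print(fraction_gdp_released(5100, 10000, 200))
```

Comparing the released fraction with and without the exchange factor (or with and without a test compound) gives a relative measure of exchange activity under these assumptions.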

[4222] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann, R. N. et al. (1994) J. Med. Chem. 37:2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).

[4223] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. USA 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and Gallop et al. (1994) J. Med. Chem. 37:1233.

[4224] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner, U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. 5,223,409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382; Felici (1991) J. Mol. Biol. 222:301-310; Ladner supra.).

[4225] In one embodiment, an assay is a cell-based assay in which a cell which expresses a 33521 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 33521 activity is determined. Determining the ability of the test compound to modulate 33521 activity can be accomplished by monitoring, for example, stimulation of guanine nucleotide exchange on a Rho protein. The cell, for example, can be of mammalian origin, e.g., human.

[4226] The ability of the test compound to modulate 33521 binding to a compound, e.g., a 33521 substrate, or to bind to 33521 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 33521 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 33521 could be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 33521 binding to a 33521 substrate in a complex. For example, compounds (e.g., 33521 substrates) can be labeled with ¹²⁵I, ³⁵S, ¹⁴C, or ³H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[4227] The ability of a compound (e.g., a 33521 substrate) to interact with 33521 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 33521 without the labeling of either the compound or the 33521 (McConnell, H. M. et al. (1992) Science 257:1906-1912). As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 33521.

[4228] In yet another embodiment, a cell-free assay is provided in which a 33521 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 33521 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 33521 proteins to be used in assays of the present invention include fragments which participate in interactions with non-33521 molecules, e.g., fragments with high surface probability scores.

[4229] Soluble and/or membrane-bound forms of isolated proteins (e.g., 33521 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, isotridecylpoly(ethylene glycol ether)ₙ, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[4230] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[4231] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternately, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
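
The distance dependence of energy transfer mentioned above is commonly described by the Förster relation, E = 1 / (1 + (r/R₀)⁶), where r is the donor-acceptor distance and R₀ is the distance at which transfer is 50% efficient. The Python sketch below evaluates this relation; the function name and the example distances are hypothetical, and R₀ must be supplied for the particular donor/acceptor pair used.

```python
def fret_efficiency(r_nm: float, r0_nm: float) -> float:
    """Energy-transfer efficiency for donor-acceptor distance r_nm.

    r0_nm is the Forster distance (50% efficiency) for the chosen label pair.
    """
    if r_nm < 0 or r0_nm <= 0:
        raise ValueError("r_nm must be >= 0 and r0_nm must be > 0")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


if __name__ == "__main__":
    # Hypothetical label pair with a 5 nm Forster distance.
    for r in (2.5, 5.0, 10.0):
        print(r, round(fret_efficiency(r, 5.0), 4))
```

Because efficiency falls with the sixth power of distance, the acceptor signal changes sharply between bound and unbound states, which is why the acceptor emission is expected to be maximal when binding occurs.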

[4232] In another embodiment, determining the ability of the 33521 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal which can be used as an indication of real-time reactions between biological molecules.
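
Real-time binding curves of this kind are often interpreted with a simple 1:1 (Langmuir) interaction model. The Python sketch below computes the expected response during analyte injection under that model; the model choice, the function name, and all rate constants and concentrations are illustrative assumptions, not values or methods taken from the cited references or from any instrument software.

```python
import math


def spr_association(t_s: float, conc_m: float, ka: float, kd: float,
                    r_max: float) -> float:
    """Response (arbitrary units) at time t_s for a 1:1 binding model.

    ka: association rate constant (1/(M*s)); kd: dissociation rate (1/s);
    conc_m: analyte concentration (M); r_max: response at saturation.
    """
    if t_s < 0 or conc_m < 0 or ka <= 0 or kd <= 0 or r_max <= 0:
        raise ValueError("rates and r_max must be > 0; time and conc >= 0")
    k_d_eq = kd / ka                              # equilibrium constant, M
    r_eq = r_max * conc_m / (conc_m + k_d_eq)     # plateau response
    k_obs = ka * conc_m + kd                      # observed rate, 1/s
    return r_eq * (1.0 - math.exp(-k_obs * t_s))


if __name__ == "__main__":
    # Hypothetical interaction: ka = 1e5 1/(M*s), kd = 1e-3 1/s, 10 nM analyte.
    for t in (0, 100, 500, 2000):
        print(t, round(spr_association(t, 1e-8, 1e5, 1e-3, 100.0), 2))
```

In this example the analyte concentration equals the equilibrium dissociation constant, so the curve plateaus at half of the saturation response.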

[4233] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound (which is not anchored) can be labeled, either directly or indirectly, with detectable labels discussed herein.

[4234] It may be desirable to immobilize either 33521, an anti-33521 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a 33521 protein, or interaction of a 33521 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/33521 fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 33521 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and complex formation is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 33521 binding or activity determined using standard techniques.

[4235] Other techniques for immobilizing either a 33521 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 33521 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical).

[4236] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[4237] In one embodiment, this assay is performed utilizing antibodies reactive with 33521 protein or target molecules but which do not interfere with binding of the 33521 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 33521 protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 33521 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 33521 protein or target molecule.

[4238] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., (1993) Trends Biochem Sci 18:284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. (1999) Current Protocols in Molecular Biology, J. Wiley: New York); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. (1999) Current Protocols in Molecular Biology, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., (1998) J Mol Recognit 11:141-8; Hage, D. S., and Tweed, S. A. (1997) J Chromatogr B Biomed Sci Appl. 699:499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[4239] In a preferred embodiment, the assay includes contacting the 33521 protein or biologically active portion thereof with a known compound which binds 33521 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a 33521 protein, wherein determining the ability of the test compound to interact with a 33521 protein includes determining the ability of the test compound to preferentially bind to 33521 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[4240] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 33521 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of a 33521 protein through modulation of the activity of a downstream effector of a 33521 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[4241] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), a reaction mixture containing the target gene product and the binding partner is prepared under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.

[4242] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[4243] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner, is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled, either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[4244] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[4245] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[4246] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in that either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[4247] In yet another aspect, the 33521 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 33521 (“33521-binding proteins” or “33521-bp”) and are involved in 33521 activity. Such 33521-bps can be activators or inhibitors of signals by the 33521 proteins or 33521 targets as, for example, downstream elements of a 33521-mediated signaling pathway.

[4248] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a 33521 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively, the 33521 protein can be fused to the activation domain.) If the “bait” and the “prey” proteins are able to interact, in vivo, forming a 33521-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., lacZ) which is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene which encodes the protein which interacts with the 33521 protein.

[4249] In another embodiment, modulators of 33521 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 33521 mRNA or protein evaluated relative to the level of expression of 33521 mRNA or protein in the absence of the candidate compound. When expression of 33521 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 33521 mRNA or protein expression. Alternatively, when expression of 33521 mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 33521 mRNA or protein expression. The level of 33521 mRNA or protein expression can be determined by methods described herein for detecting 33521 mRNA or protein.
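The decision rule in the preceding paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the replicate expression values are hypothetical, the function names `permutation_p_value` and `classify` are not part of the disclosure, and an exact permutation test at a 0.05 threshold stands in for whatever statistical test a practitioner would actually choose. The sketch also applies the significance requirement in both directions, which is an assumption beyond the text.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(treated, untreated):
    """Two-sided exact permutation test on the difference of group means."""
    pooled = treated + untreated
    n = len(treated)
    total = sum(pooled)
    observed = abs(mean(treated) - mean(untreated))
    extreme = count = 0
    # Enumerate every way of relabeling n of the pooled values as "treated".
    for idx in combinations(range(len(pooled)), n):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n - (total - group_sum) / (len(pooled) - n))
        count += 1
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    return extreme / count

def classify(treated, untreated, alpha=0.05):
    """Label a candidate compound from expression levels with and without it."""
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"

# Hypothetical 33521 mRNA levels (arbitrary units), four replicates each.
without_compound = [10.1, 9.8, 10.4, 10.0]
with_compound = [5.2, 4.9, 5.5, 5.1]
label = classify(with_compound, without_compound)
print(label)
```

With four replicates per group the smallest attainable two-sided p-value is 2/70, so fewer replicates could never reach the 0.05 threshold under this particular test.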

[4250] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a 33521 protein can be confirmed in vivo, e.g., in an animal such as an animal model for disorders of cell proliferation and differentiation, e.g., cancer and metastasis.

[4251] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a 33521 modulating agent, an antisense 33521 nucleic acid molecule, a 33521-specific antibody, or a 33521-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[4252] 33521 Detection Assays

[4253] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome, e.g., to locate gene regions associated with genetic disease or to associate 33521 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[4254] 33521 Chromosome Mapping

[4255] The 33521 nucleotide sequences or portions thereof can be used to map the location of the 33521 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 33521 sequences with genes associated with disease.

[4256] Briefly, 33521 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 33521 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 33521 sequences will yield an amplified fragment.

[4257] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes, and a full set of mouse chromosomes, can allow easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220:919-924).
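The logic of assigning a gene to a chromosome from a hybrid panel can be illustrated with a small concordance check. The following Python sketch rests on assumptions: the panel composition, the PCR results, and the function name `assign_chromosome` are hypothetical, and the sketch requires perfect concordance, whereas real panels may need to tolerate scoring errors.

```python
def assign_chromosome(panel, pcr_positive):
    """Return chromosomes whose presence across hybrids matches the PCR results.

    panel: dict mapping hybrid name -> set of human chromosomes it retains.
    pcr_positive: set of hybrid names that yielded an amplified fragment.
    """
    candidates = set().union(*panel.values())
    concordant = []
    for chrom in sorted(candidates):
        # Concordant means: every hybrid carrying the chromosome is
        # PCR-positive and every hybrid lacking it is PCR-negative.
        if all((chrom in chroms) == (hybrid in pcr_positive)
               for hybrid, chroms in panel.items()):
            concordant.append(chrom)
    return concordant

# Hypothetical panel: each hybrid line retains a few human chromosomes.
panel = {
    "hybrid_A": {"1", "7", "X"},
    "hybrid_B": {"7", "12"},
    "hybrid_C": {"1", "12"},
    "hybrid_D": {"3", "X"},
}
pcr_positive = {"hybrid_A", "hybrid_B"}  # hypothetical PCR screening results
result = assign_chromosome(panel, pcr_positive)
print(result)
```

In this made-up panel only chromosome 7 is present in both positive hybrids and absent from both negative ones, so it is the sole candidate.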

[4258] Other mapping strategies, e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries, can be used to map 33521 to a chromosomal location.

[4259] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases, will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques ((1988) Pergamon Press, New York).

[4260] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[4261] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325:783-787.

[4262] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 33521 gene, can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

[4263] 33521 Tissue Typing

[4264] 33521 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).
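How a sequence difference between individuals produces a different fragment pattern can be shown with a simulated digest. The Python sketch below is illustrative only: the two allele sequences are invented, the function name `digest` is hypothetical, and the model treats a single linear strand cut by EcoRI (recognition site GAATTC, cut after the first G), ignoring the probing and blotting steps.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the position within the site where the strand is cut
    (1 for EcoRI, which cuts between G and AATTC).
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

# Hypothetical alleles: allele_2 carries a single-base change
# (GAATTC -> GAGTTC) that destroys the second recognition site.
allele_1 = "ATGC" * 3 + "GAATTC" + "TTAGC" * 4 + "GAATTC" + "CCGTA" * 2
allele_2 = "ATGC" * 3 + "GAATTC" + "TTAGC" * 4 + "GAGTTC" + "CCGTA" * 2

frags_1 = digest(allele_1)
frags_2 = digest(allele_2)
print(frags_1, frags_2)
```

The loss of one site merges two fragments into one longer fragment, which is the kind of band difference an RFLP analysis detects.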

[4265] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 33521 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.

[4266] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences of SEQ ID NO:61 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO:63 are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[4267] If a panel of reagents from 33521 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[4268] Use of Partial 33521 Sequences in Forensic Biology

[4269] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[4270] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e. another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO:61 (e.g., fragments derived from the noncoding regions of SEQ ID NO:61 having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[4271] The 33521 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 33521 probes can be used to identify tissue by species and/or by organ type.

[4272] In a similar fashion, these reagents, e.g., 33521 primers or probes, can be used to screen tissue culture for contamination (i.e., screen for the presence of a mixture of different types of cells in a culture).

[4273] Predictive Medicine of 33521

[4274] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[4275] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene which encodes 33521.

[4276] Such disorders include, e.g., a disorder associated with the misexpression of 33521 gene; a disorder of cell proliferation, e.g., cancer, or a disorder of cell motility, e.g., metastatic cancer.

[4277] The method includes one or more of the following:

[4278] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 33521 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[4279] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 33521 gene;

[4280] detecting, in a tissue of the subject, the misexpression of the 33521 gene, at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[4281] detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of a 33521 polypeptide.

[4282] In preferred embodiments the method includes: ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 33521 gene; an insertion of one or more nucleotides into the gene, a point mutation, e.g., a substitution of one or more nucleotides of the gene, a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[4283] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from SEQ ID NO:61, or naturally occurring mutants thereof or 5′ or 3′ flanking sequences naturally associated with the 33521 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[4284] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 33521 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 33521.

[4285] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[4286] In preferred embodiments the method includes determining the structure of a 33521 gene, an abnormal structure being indicative of risk for the disorder.

[4287] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 33521 protein or a nucleic acid, which hybridizes specifically with the gene. These and other embodiments are discussed below.

[4288] Diagnostic and Prognostic Assays of 33521

[4289] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 33521 molecules and for identifying variations and mutations in the sequence of 33521 molecules.

[4290] Expression Monitoring and Profiling:

[4291] The presence, level, or absence of 33521 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 33521 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 33521 protein such that the presence of 33521 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the 33521 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 33521 genes; measuring the amount of protein encoded by the 33521 genes; or measuring the activity of the protein encoded by the 33521 genes.

[4292] The level of mRNA corresponding to the 33521 gene in a cell can be determined both by in situ and by in vitro formats.

[4293] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 33521 nucleic acid, such as the nucleic acid of SEQ ID NO:61, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 33521 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[4294] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 33521 genes.

[4295] The level of mRNA in a sample that is encoded by 33521 can be evaluated with nucleic acid amplification, e.g., by rtPCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
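The primer geometry described above (two primers of about 10 to 30 nucleotides flanking a region of about 50 to 200 nucleotides) can be checked computationally. The Python sketch below is a simplified illustration: the template and primer sequences are invented, the function names are hypothetical, exact-match lookup replaces real annealing behavior, and the "about" ranges are treated as strict bounds for convenience.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))

def check_primer_pair(template, forward, reverse,
                      primer_len=(10, 30), flank_len=(50, 200)):
    """Locate a primer pair on the plus strand and report the flanked region."""
    fwd_start = template.find(forward)
    # The reverse primer anneals to the plus strand at its reverse complement.
    rev_start = template.find(reverse_complement(reverse))
    if fwd_start == -1 or rev_start == -1:
        return None
    flanked = rev_start - (fwd_start + len(forward))
    return {
        "flanked_length": flanked,
        "primers_ok": all(primer_len[0] <= len(p) <= primer_len[1]
                          for p in (forward, reverse)),
        "flank_ok": flank_len[0] <= flanked <= flank_len[1],
    }

# Hypothetical template: forward site, 80-nt internal region, reverse site.
forward = "ATGGCCAAGTTCGACCTG"
rev_site = "GGATCCTTAGCAGTCA"
template = "TTTT" + forward + "ACGT" * 20 + rev_site + "AAAA"
reverse = reverse_complement(rev_site)

result = check_primer_pair(template, forward, reverse)
print(result)
```

A practical design tool would also weigh melting temperature, secondary structure, and specificity, none of which this sketch attempts.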

[4296] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 33521 gene being analyzed.

[4297] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 33521 mRNA, or genomic DNA, and comparing the presence of 33521 mRNA or genomic DNA in the control sample with the presence of 33521 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 33521 transcript levels.

[4298] A variety of methods can be used to determine the level of protein encoded by 33521. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody, with a sample to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)₂) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[4299] The detection methods can be used to detect 33521 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 33521 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 33521 protein include introducing into a subject a labeled anti-33521 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-33521 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[4300] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 33521 protein, and comparing the presence of 33521 protein in the control sample with the presence of 33521 protein in the test sample.

[4301] The invention also includes kits for detecting the presence of 33521 in a biological sample. For example, the kit can include a compound or agent capable of detecting 33521 protein or mRNA in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 33521 protein or nucleic acid.

[4302] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[4303] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples which can be assayed and compared to the test sample contained. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[4304] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with misexpressed or aberrant or unwanted 33521 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as pain or deregulated cell proliferation.

[4305] In one embodiment, a disease or disorder associated with aberrant or unwanted 33521 expression or activity is identified. A test sample is obtained from a subject and 33521 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 33521 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 33521 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[4306] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 33521 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for a cell proliferation and differentiation disorder, e.g., cancer and metastasis.

[4307] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 33521 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 33521 (e.g., other genes associated with a 33521-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
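One possible layout for such data records as a relational table is sketched below in Python. Everything here is an assumption made for illustration: the table and column names are invented, the sample values are fabricated placeholders, and an in-memory SQLite database stands in for the Oracle or Sybase environments mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE expression_record (
        sample_id   TEXT NOT NULL,   -- identifier of the sample
        subject_id  TEXT,            -- subject from which the sample was derived
        diagnosis   TEXT,            -- descriptor: diagnosis
        treatment   TEXT,            -- descriptor: treatment
        gene        TEXT NOT NULL,   -- '33521' or another gene on the array
        level       REAL NOT NULL,   -- value representing the expression level
        PRIMARY KEY (sample_id, gene)
    )
    """
)

# Placeholder records; one row per (sample, gene) pair.
rows = [
    ("S1", "P1", "tumor", "agent_X", "33521", 8.4),
    ("S1", "P1", "tumor", "agent_X", "GENE_B", 2.1),
    ("S2", "P2", "normal", None, "33521", 3.0),
]
conn.executemany("INSERT INTO expression_record VALUES (?, ?, ?, ?, ?, ?)", rows)

# Retrieve the 33521 expression value for every sample.
levels = conn.execute(
    "SELECT sample_id, level FROM expression_record "
    "WHERE gene = ? ORDER BY sample_id",
    ("33521",),
).fetchall()
print(levels)
```

Storing one row per sample and gene keeps the record open to values for genes other than 33521 without changing the schema; repeating the descriptors on every row is a simplification that a normalized design would move to a separate sample table.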

[4308] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 33521 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used to diagnose a disorder in a subject wherein a change in 33521 expression is an indication that the subject has or is disposed to having a disorder. The method can be used to monitor a treatment for a disorder in a subject. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[4309] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 33521 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for a desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from an uncontacted cell.

[4310] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; and b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 33521 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profiles is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
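
The distance metric mentioned above can be sketched as follows. This is a minimal illustration that assumes each profile is a list of expression values in the same gene order; the function name `profile_distance` is hypothetical.

```python
import math


def profile_distance(profile_a, profile_b):
    """Length of the difference vector between two expression profiles.

    Both profiles must list values for the same genes in the same order.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same number of dimensions")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))
```

A smaller distance indicates more similar profiles, and a distance of zero indicates identical profiles. Other routine measures, such as a correlation coefficient, could be substituted.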

[4311] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[4312] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile, and the reference expression profiles each include a value representing the level of 33521 expression.
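
One way the steps recited above might be coded is sketched below. The sketch assumes that each profile is a mapping from a gene identifier to an expression value and that the database of reference profiles is a mapping from a label to such a profile; the function names and the choice of distance over shared genes as the comparison score are illustrative assumptions, not requirements.

```python
import math


def comparison_score(subject, reference):
    """Distance between two profiles, computed over the genes they share.

    Lower scores mean greater similarity.
    """
    shared = set(subject) & set(reference)
    if not shared:
        raise ValueError("profiles share no genes")
    return math.sqrt(sum((subject[g] - reference[g]) ** 2 for g in shared))


def select_matching_profile(subject, reference_db):
    """Score the subject against every reference and pick the closest one.

    Returns (label of the most similar reference, {label: score}).
    """
    scores = {
        label: comparison_score(subject, reference)
        for label, reference in reference_db.items()
    }
    best = min(scores, key=scores.get)
    return best, scores
```

Returning both the best match and the full set of scores covers alternatives i) and ii) in a single call.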

[4313] 33521 Arrays and Uses Thereof

[4314] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to a 33521 molecule (e.g., a 33521 nucleic acid or a 33521 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm², and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to the addresses of the plurality can be disposed on the array.

[4315] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to a 33521 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 33521. Each address of the subset can include a capture probe that hybridizes to a different region of a 33521 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for a 33521 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 33521 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 33521 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).

[4316] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[4317] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to a 33521 polypeptide or fragment thereof. The polypeptide can be a naturally-occurring interaction partner of 33521 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-33521 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[4318] In another aspect, the invention features a method of analyzing the expression of 33521. The method includes providing an array as described above; contacting the array with a sample; and detecting binding of a 33521-molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally, the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[4319] In another embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array, particularly the expression of 33521. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 33521. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
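
As a simplified stand-in for the clustering methods named above (it is not hierarchical, k-means, or Bayesian clustering), genes whose expression across samples tracks that of 33521 can be flagged with a Pearson correlation threshold. In this sketch the input is assumed to map each gene identifier to a list of expression values over the same ordered set of samples; the function names and the default threshold of 0.9 are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of expression values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-regulation signal
    return cov / math.sqrt(var_x * var_y)


def coregulated_with(target, expression, threshold=0.9):
    """Return genes whose profile correlates with the target at or above threshold."""
    reference = expression[target]
    return sorted(
        gene
        for gene, levels in expression.items()
        if gene != target and pearson(reference, levels) >= threshold
    )
```

Correlation is insensitive to the absolute level of expression, so this sketch groups genes by the pattern of their expression across samples rather than by its magnitude.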

[4320] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 33521 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[4321] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of other than the target gene can be ascertained and counteracted.

[4322] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of a 33521-associated disease or disorder; and processes, such as a cellular transformation associated with a 33521-associated disease or disorder. The method can also evaluate the treatment and/or progression of a 33521-associated disease or disorder.

[4323] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 33521) that could serve as a molecular target for diagnosis or therapeutic intervention.

[4324] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon a 33521 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to a 33521 polypeptide or fragment thereof. For example, multiple variants of a 33521 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the addresses of the plurality can be disposed on the array.

[4325] The polypeptide array can be used to detect a 33521 binding compound, e.g., an antibody in a sample from a subject with specificity for a 33521 polypeptide or the presence of a 33521-binding protein or ligand.

[4326] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 33521 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[4327] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 33521 or from a cell or subject in which a 33521-mediated response has been elicited, e.g., by contact of the cell with 33521 nucleic acid or protein, or administration to the cell or subject of 33521 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 33521 (or does not express as highly as in the case of the 33521 positive plurality of capture probes) or from a cell or subject in which a 33521-mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which are preferably other than a 33521 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[4328] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe; contacting the array with a first sample from a cell or subject which expresses or mis-expresses 33521 or from a cell or subject in which a 33521-mediated response has been elicited, e.g., by contact of the cell with 33521 nucleic acid or protein, or administration to the cell or subject of 33521 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe; contacting the array with a second sample from a cell or subject which does not express 33521 (or does not express as highly as in the case of the 33521 positive plurality of capture probes) or from a cell or subject in which a 33521-mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used, the plurality of addresses with capture probes should be present on both arrays.

[4329] In another aspect, the invention features a method of analyzing 33521, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 33521 nucleic acid or amino acid sequence; and comparing the 33521 sequence with one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 33521.

[4330] Detection of 33521 Variations or Mutations

[4331] The methods of the invention can also be used to detect genetic alterations in a 33521 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by misregulation in 33521 protein activity or nucleic acid expression, such as a cell proliferation and differentiation disorder, e.g., cancer and metastasis. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a 33521-protein, or the mis-expression of the 33521 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of 1) a deletion of one or more nucleotides from a 33521 gene; 2) an addition of one or more nucleotides to a 33521 gene; 3) a substitution of one or more nucleotides of a 33521 gene, 4) a chromosomal rearrangement of a 33521 gene; 5) an alteration in the level of a messenger RNA transcript of a 33521 gene, 6) aberrant modification of a 33521 gene, such as of the methylation pattern of the genomic DNA, 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of a 33521 gene, 8) a non-wild type level of a 33521-protein, 9) allelic loss of a 33521 gene, and 10) inappropriate post-translational modification of a 33521-protein.

[4332] An alteration can be detected with a probe/primer in a polymerase chain reaction, such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 33521 gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a 33521 gene under conditions such that hybridization and amplification of the 33521 gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[4333] In another embodiment, mutations in a 33521 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA is isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment lengths are determined, e.g., by gel electrophoresis, and compared. Differences in fragment lengths between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence-specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
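
The comparison of restriction fragment lengths described above can be modeled in silico. The following minimal sketch assumes linear DNA given as a single-strand string and a single enzyme; the defaults use the EcoRI recognition site GAATTC, which is cut after the first G, and the function names are hypothetical.

```python
def digest(sequence, site, cut_offset):
    """Return the fragment lengths obtained by cutting a linear sequence.

    A cut is made `cut_offset` bases into every occurrence of `site`.
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(boundaries, boundaries[1:]) if end > begin]


def fragment_patterns_differ(sample, control, site="GAATTC", cut_offset=1):
    """True if the sample and control give different sets of fragment lengths."""
    return sorted(digest(sample, site, cut_offset)) != sorted(
        digest(control, site, cut_offset)
    )
```

A point mutation that destroys or creates a recognition site changes the number and size of fragments, which is what the gel-based comparison detects; mutations outside recognition sites are not detected by this approach.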

[4334] In other embodiments, genetic mutations in 33521 can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the others. A different probe is located at each address of the plurality. A probe can be complementary to a region of a 33521 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of a 33521 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 33521 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
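
The linear array of sequential overlapping probes described above can be illustrated by tiling a target sequence. This is a minimal sketch that assumes an unambiguous DNA target written 5′ to 3′ with the letters A, C, G, and T; the function names, the probe length, and the step size are hypothetical parameters.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


def tile_probes(target, probe_length, step=1):
    """Return (start, probe) pairs that tile the target with overlapping probes.

    Each probe is complementary to the target window beginning at `start`.
    """
    if probe_length <= 0 or step <= 0:
        raise ValueError("probe_length and step must be positive")
    return [
        (start, reverse_complement(target[start:start + probe_length]))
        for start in range(0, len(target) - probe_length + 1, step)
    ]
```

With a step of one, every base of the target other than those near the ends is covered by `probe_length` different probes, so a single base change alters hybridization at a run of adjacent addresses.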

[4335] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 33521 gene and detect mutations by comparing the sequence of the sample 33521 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.

[4336] Other methods for detecting mutations in the 33521 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl. Acad Sci USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295).

[4337] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 33521 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymidine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[4338] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 33521 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc. Natl. Acad. Sci. USA 86:2766; see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control 33521 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence; the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).

[4339] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[4340] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad. Sci. USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[4341] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.

[4342] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 50%, 60%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to a 33521 nucleic acid.

[4343] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO:61 or the complement of SEQ ID NO:61. Different locations can be different but overlapping, or non-overlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[4344] The set can be useful, e.g., for identifying SNPs, or identifying specific alleles of 33521. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a biallelic or polymorphic locus.

[4345] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the T_(m) of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
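
A set of four oligonucleotides that differ only at the interrogation position can be generated as sketched below. The sketch assumes a DNA oligonucleotide written with the letters A, C, G, and T and a zero-based interrogation position; the function name `interrogation_set` is hypothetical.

```python
def interrogation_set(oligo, position):
    """Return four variants of `oligo`, one per base at `position` (zero-based).

    The result maps the base placed at the interrogation position to the
    full oligonucleotide sequence carrying it.
    """
    if not 0 <= position < len(oligo):
        raise IndexError("interrogation position lies outside the oligonucleotide")
    return {
        base: oligo[:position] + base + oligo[position + 1:]
        for base in "ACGT"
    }
```

One member of the returned set is always the starting oligonucleotide itself; the other three carry each alternative base at the interrogation position.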

[4346] In a preferred embodiment, the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, a 33521 nucleic acid.

[4347] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a 33521 gene.

[4348] Use of 33521 Molecules as Surrogate Markers

[4349] The 33521 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 33521 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 33521 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker which correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[4350] The 33521 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker which correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., a 33521 marker) transcription or expression, the amplified marker may be in a quantity which is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-33521 antibodies may be employed in an immune-based detection system for a 33521 protein marker, or 33521-specific radiolabeled probes may be used to detect a 33521 mRNA marker. Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. 
Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[4351] The 33521 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker which correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35:1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA or protein (e.g., 33521 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 33521 DNA may correlate with 33521 drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[4352] Pharmaceutical Compositions of 33521

[4353] The nucleic acid and polypeptides, fragments thereof, as well as anti-33521 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[4354] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[4355] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol, sorbitol, sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

[4356] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying which yields a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[4357] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[4358] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[4359] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[4360] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[4361] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[4362] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[4363] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds which exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
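The ratio described above can be illustrated with a minimal sketch. The function name, the compound names, and the LD50 and ED50 values below are hypothetical and chosen only to show the arithmetic; they are not data from this disclosure.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index LD50/ED50 (both doses in the same units)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical (LD50, ED50) pairs in mg/kg, for illustration only.
candidates = {
    "compound_a": (500.0, 5.0),
    "compound_b": (120.0, 10.0),
}

# Rank candidates so that the highest therapeutic index comes first.
ranked = sorted(
    candidates,
    key=lambda name: therapeutic_index(*candidates[name]),
    reverse=True,
)
```

Under these assumed values, the compound with the larger ratio is listed first, which mirrors the stated preference for compounds that exhibit high therapeutic indices.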

[4364] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.
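One way such an IC50 might be estimated from cell culture data is sketched below. The sketch assumes that inhibition is reported as a percentage of a 100% maximum and interpolates linearly on the logarithm of concentration between the two tested points that bracket 50% inhibition. The function name and the data points are hypothetical, and a full dose-response curve fit could be used instead.

```python
import math


def estimate_ic50(concentrations, inhibition_pct):
    """Estimate IC50 by log-linear interpolation across the 50% crossing.

    Assumes positive concentrations and inhibition expressed as a
    percentage of a 100% maximum.
    """
    if any(c <= 0 for c in concentrations):
        raise ValueError("concentrations must be positive")
    points = sorted(zip(concentrations, inhibition_pct))
    # Walk adjacent pairs of points, ordered by increasing concentration.
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_lo != i_hi:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    raise ValueError("inhibition does not cross 50% in the tested range")


# Hypothetical concentrations (micromolar) and percent inhibition.
ic50 = estimate_ic50([0.1, 1.0, 10.0, 100.0], [10.0, 30.0, 70.0, 95.0])
```

Interpolating on a logarithmic axis reflects the common practice of spacing test concentrations by fixed multiples; the estimate is only as good as the spacing of the tested concentrations around the 50% point.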

[4365] As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
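The weight-based ranges recited above translate into absolute amounts by simple multiplication, as the following sketch shows. The function name and the 70 kg body weight are assumptions made only for illustration.

```python
def dose_range_mg(body_weight_kg: float, low_mg_per_kg: float, high_mg_per_kg: float):
    """Convert a mg/kg dosage range into an absolute range in mg."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if not 0 <= low_mg_per_kg <= high_mg_per_kg:
        raise ValueError("expected 0 <= low <= high")
    return (body_weight_kg * low_mg_per_kg, body_weight_kg * high_mg_per_kg)


# Hypothetical 70 kg subject and the "about 1 to 10 mg/kg" range.
low_mg, high_mg = dose_range_mg(70.0, 1.0, 10.0)
```

For the assumed 70 kg subject, the range of about 1 to 10 mg/kg corresponds to about 70 to 700 mg per administration.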

[4366] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration is often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J. Acquired Immune Deficiency Syndromes and Human Retrovirology 14:193).

[4367] The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[4368] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

[4369] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, puromycin, maytansinoids, e.g., maytansinol (see U.S. Pat. No. 5,208,020), CC-1065 (see U.S. Pat. Nos. 5,475,092, 5,585,499, 5,846,545) and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, CC-1065, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine, vinblastine, taxol and maytansinoids). Radioactive ions include, but are not limited to iodine, yttrium and praseodymium.

[4370] The conjugates of the invention can be used for modifying a given biological response; the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[4371] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[4372] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[4373] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[4374] Methods of Treatment for 33521

[4375] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 33521 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[4376] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. “Pharmacogenomics”, as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype”, or “drug response genotype”). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 33521 molecules of the present invention or 33521 modulators according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

[4377] In one aspect, the invention provides a method for preventing in a subject, a disease or condition associated with an aberrant or unwanted 33521 expression or activity, by administering to the subject a 33521 or an agent which modulates 33521 expression or at least one 33521 activity. Subjects at risk for a disease which is caused or contributed to by aberrant or unwanted 33521 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 33521 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 33521 aberrance, for example, a 33521, 33521 agonist or 33521 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[4378] It is possible that some 33521 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[4379] Aberrant expression and/or activity of 33521 molecules may mediate disorders associated with bone metabolism. “Bone metabolism” refers to direct or indirect effects in the formation or degeneration of bone structures, e.g., bone formation, bone resorption, etc., which may ultimately affect the concentrations in serum of calcium and phosphate. This term also includes activities mediated by the effects of 33521 molecules in bone cells, e.g., osteoclasts and osteoblasts, that may in turn result in bone formation and degeneration. For example, 33521 molecules may support different activities of bone resorbing osteoclasts such as the stimulation of differentiation of monocytes and mononuclear phagocytes into osteoclasts. Accordingly, 33521 molecules that modulate the production of bone cells can influence bone formation and degeneration, and thus may be used to treat bone disorders. Examples of such disorders include, but are not limited to, osteoporosis, osteodystrophy, osteomalacia, rickets, osteitis fibrosa cystica, renal osteodystrophy, osteosclerosis, anti-convulsant treatment, osteopenia, fibrogenesis-imperfecta ossium, secondary hyperparathyroidism, hypoparathyroidism, hyperparathyroidism, cirrhosis, obstructive jaundice, drug induced metabolism, medullary carcinoma, chronic renal disease, sarcoidosis, glucocorticoid antagonism, malabsorption syndrome, steatorrhea, tropical sprue, idiopathic hypercalcemia and milk fever.

[4380] The 33521 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of immune disorders. Examples of immune disorders or diseases include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy such as atopic allergy.

[4381] Examples of disorders involving the heart or “cardiovascular disorder” include, but are not limited to, a disease, disorder, or state involving the cardiovascular system, e.g., the heart, the blood vessels, and/or the blood. A cardiovascular disorder can be caused by an imbalance in arterial pressure, a malfunction of the heart, or an occlusion of a blood vessel, e.g., by a thrombus. Examples of such disorders include hypertension, atherosclerosis, coronary artery spasm, congestive heart failure, coronary artery disease, valvular disease, arrhythmias, and cardiomyopathies.

[4382] Disorders which may be treated or diagnosed by methods described herein include, but are not limited to, disorders associated with an accumulation in the liver of fibrous tissue, such as that resulting from an imbalance between production and degradation of the extracellular matrix accompanied by the collapse and condensation of preexisting fibers. The methods described herein can be used to diagnose or treat hepatocellular necrosis or injury induced by a wide variety of agents including processes which disturb homeostasis, such as an inflammatory process, tissue damage resulting from toxic injury or altered hepatic blood flow, and infections (e.g., bacterial, viral and parasitic). For example, the methods can be used for the early detection of hepatic injury, such as portal hypertension or hepatic fibrosis. In addition, the methods can be employed to detect liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, A1-antitrypsin deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease), disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome). 
Additionally, the methods described herein may be useful for the early detection and treatment of liver injury associated with the administration of various chemicals or drugs, such as, for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol, or which represents a hepatic manifestation of a vascular disorder such as obstruction of either the intrahepatic or extrahepatic bile flow or an alteration in hepatic circulation resulting, for example, from chronic heart failure, veno-occlusive disease, portal vein thrombosis or Budd-Chiari syndrome.

[4383] Additionally, 33521 molecules may play an important role in the etiology of certain viral diseases, including but not limited to Hepatitis B, Hepatitis C and Herpes Simplex Virus (HSV). Modulators of 33521 activity could be used to control viral diseases. The modulators can be used in the treatment and/or diagnosis of viral infected tissue or virus-associated tissue fibrosis, especially liver and liver fibrosis. Also, 33521 modulators can be used in the treatment and/or diagnosis of virus-associated carcinoma, especially hepatocellular cancer.

[4384] Additionally, 33521 may play an important role in the regulation of metabolism or pain disorders. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes. Examples of pain disorders include, but are not limited to, pain response elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia, usually referred to as hyperalgesia (described in, for example, Fields, H. L. (1987) Pain, New York: McGraw-Hill); pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches; pain associated with surgery; pain related to irritable bowel syndrome; or chest pain.

[4385] As discussed, successful treatment of 33521 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using an assay described above, that prove to exhibit negative modulatory activity, can be used in accordance with the invention to prevent and/or ameliorate symptoms of 33521 disorders. Such molecules can include, but are not limited to, peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)₂ and Fab expression library fragments, scFV molecules, and epitope-binding fragments thereof).

[4386] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[4387] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[4388] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 33521 expression is through the use of aptamer molecules specific for 33521 protein. Aptamers are nucleic acid molecules having a tertiary structure which permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. (1997) Curr. Opin. Chem Biol. 1: 5-9; and Patel, D. J. (1997) Curr Opin Chem Biol 1:32-46). Since nucleic acid molecules may in many cases be more conveniently introduced into target cells than therapeutic protein molecules may be, aptamers offer a method by which 33521 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[4389] Antibodies can be generated that are both specific for target gene product and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 33521 disorders. For a description of antibodies, see the Antibody section above.

[4390] In circumstances wherein injection of an animal or a human subject with a 33521 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 33521 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. (1999) Ann Med 31:66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. (1998) Cancer Treat Res. 94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 33521 protein. Vaccines directed to a disease characterized by 33521 expression may also be generated in this fashion.

[4391] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see e.g., Marasco et al. (1993) Proc. Natl. Acad. Sci. USA 90:7889-7893).

[4392] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 33521 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures as described above.

[4393] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED₅₀ with little or no toxicity. The dosage can vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose can be formulated in animal models to achieve a circulating plasma concentration range that includes the IC₅₀ (i.e., the concentration of the test compound that achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma can be measured, for example, by high performance liquid chromatography. Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. The compound which is able to modulate 33521 activity is used as a template, or “imprinting molecule”, to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix which contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al (1996) Current Opinion in Biotechnology 7:89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2:166-173. 
Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al (1993) Nature 361:645-647. Through the use of isotope-labeling, the “free” concentration of compound which modulates the expression or activity of 33521 can be readily monitored and used in calculations of IC₅₀.
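The IC₅₀ calculation mentioned above can be sketched as follows. This is a minimal illustration only, assuming a monotonically increasing dose-response series and simple linear interpolation on a log concentration axis; the function name `estimate_ic50` and the readings are hypothetical and are not taken from this document.

```python
import math

def estimate_ic50(concentrations, inhibitions):
    """Estimate IC50 by linear interpolation on a log10 concentration axis.

    concentrations: increasing, positive concentrations (any consistent unit).
    inhibitions: fractional inhibition (0.0 to 1.0) observed at each concentration.
    Returns None if the series never crosses 50% inhibition.
    """
    points = list(zip(concentrations, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        # Look for the first adjacent pair of readings that brackets 50%.
        if i_lo <= 0.5 <= i_hi and i_hi > i_lo:
            frac = (0.5 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None

# Hypothetical readings, for illustration only (not data from this document).
doses_nM = [1.0, 10.0, 100.0, 1000.0]
inhibition = [0.05, 0.30, 0.70, 0.95]
ic50 = estimate_ic50(doses_nM, inhibition)
```

In practice a fitted sigmoidal model would usually be preferred over interpolation; the interpolation is used here only to keep the sketch short.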

[4394] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC₅₀. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al (1995) Analytical Chemistry 67:2142-2144.

[4395] Another aspect of the invention pertains to methods of modulating 33521 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with a 33521 molecule or an agent that modulates one or more of the activities of 33521 protein associated with the cell. An agent that modulates 33521 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of a 33521 protein (e.g., a 33521 substrate or receptor), a 33521 antibody, a 33521 agonist or antagonist, a peptidomimetic of a 33521 agonist or antagonist, or other small molecule.

[4396] In one embodiment, the agent stimulates one or more 33521 activities. Examples of such stimulatory agents include active 33521 protein and a nucleic acid molecule encoding 33521. In another embodiment, the agent inhibits one or more 33521 activities. Examples of such inhibitory agents include antisense 33521 nucleic acid molecules, anti-33521 antibodies, and 33521 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of a 33521 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents, that modulates (e.g., up regulates or down regulates) 33521 expression or activity. In another embodiment, the method involves administering a 33521 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 33521 expression or activity.

[4397] Stimulation of 33521 activity is desirable in situations in which 33521 is abnormally downregulated and/or in which increased 33521 activity is likely to have a beneficial effect. For example, stimulation of 33521 activity is desirable in situations in which a 33521 is downregulated and/or in which increased 33521 activity is likely to have a beneficial effect. Likewise, inhibition of 33521 activity is desirable in situations in which 33521 is abnormally upregulated and/or in which decreased 33521 activity is likely to have a beneficial effect.

[4398] 33521 Pharmacogenomics

[4399] The 33521 molecules of the present invention, as well as agents, or modulators which have a stimulatory or inhibitory effect on 33521 activity (e.g., 33521 gene expression) as identified by a screening assay described herein can be administered to individuals to treat (prophylactically or therapeutically) 33521 associated disorders (e.g., cell proliferation and differentiation disorder, e.g., cancer and metastasis) associated with aberrant or unwanted 33521 activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a 33521 molecule or 33521 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a 33521 molecule or 33521 modulator.

[4400] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23:983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43:254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as single factors altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase (G6PD) deficiency is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[4401] One pharmacogenomics approach to identifying genes that predict drug response, known as a “genome-wide association”, relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high-resolution map can be generated from a combination of some ten million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
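The grouping of individuals by SNP pattern described above can be illustrated with a short sketch. It assumes genotype calls are already available as simple strings per marker; the function name `group_by_snp_pattern`, the marker identifiers, and the subject labels are all hypothetical.

```python
from collections import defaultdict

def group_by_snp_pattern(genotypes, marker_ids):
    """Group subjects that share the same genotype calls at the chosen markers.

    genotypes: {subject: {marker_id: genotype call}}
    marker_ids: ordered markers that define the pattern.
    Returns {pattern tuple: [subjects]}.
    """
    groups = defaultdict(list)
    for subject, calls in genotypes.items():
        pattern = tuple(calls[m] for m in marker_ids)
        groups[pattern].append(subject)
    return dict(groups)

# Hypothetical genotype calls at two bi-allelic markers, for illustration only.
calls = {
    "subject_1": {"snp_a": "AA", "snp_b": "CT"},
    "subject_2": {"snp_a": "AG", "snp_b": "CT"},
    "subject_3": {"snp_a": "AA", "snp_b": "CT"},
}
categories = group_by_snp_pattern(calls, ["snp_a", "snp_b"])
```

Each resulting category could then be examined for a shared drug response; real association studies would add statistical testing, which this sketch omits.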

[4402] Alternatively, a method termed the “candidate gene approach” can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., a 33521 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.

[4403] Alternatively, a method termed “gene expression profiling” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., a 33521 molecule or 33521 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[4404] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 33521 molecule or 33521 modulator, such as a modulator identified by one of the exemplary screening assays described herein.

[4405] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 33521 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 33521 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells, e.g., human cells, will become sensitive to treatment with an agent that the unmodified target cells were resistant to.

[4406] Monitoring the influence of agents (e.g., drugs) on the expression or activity of a 33521 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 33521 gene expression, protein levels, or upregulate 33521 activity, can be monitored in clinical trials of subjects exhibiting decreased 33521 gene expression, protein levels, or downregulated 33521 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 33521 gene expression, protein levels, or downregulate 33521 activity, can be monitored in clinical trials of subjects exhibiting increased 33521 gene expression, protein levels, or upregulated 33521 activity. In such clinical trials, the expression or activity of a 33521 gene, and preferably, other genes that have been implicated in, for example, a 33521-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[4407] 33521 Informatics

[4408] The sequence of a 33521 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a 33521 sequence. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 33521 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequence, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[4409] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[4410] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[4411] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples for annotations to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites, and splice junctions. Non-limiting examples for annotations to amino acid sequences include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
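The two-table layout described above can be sketched with an in-memory SQLite database. This is an illustrative assumption only: SQLite stands in for the named database products, and the table names, column names, placeholder sequence, and annotations are hypothetical rather than part of the disclosure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- First table: sequence in the first column, identifier in the second.
CREATE TABLE sequence (
    residues TEXT NOT NULL,
    seq_id   TEXT PRIMARY KEY
);
-- Second table: identifier, descriptor, initial and ultimate positions (1-based).
CREATE TABLE annotation (
    seq_id     TEXT NOT NULL REFERENCES sequence(seq_id),
    descriptor TEXT NOT NULL,
    start_pos  INTEGER NOT NULL,
    end_pos    INTEGER NOT NULL
);
""")

# Placeholder data for illustration; not a sequence from this document.
conn.execute("INSERT INTO sequence VALUES (?, ?)", ("ATGGCCAAGTGA", "demo_seq"))
conn.executemany(
    "INSERT INTO annotation VALUES (?, ?, ?, ?)",
    [("demo_seq", "placeholder start codon", 1, 3),
     ("demo_seq", "placeholder stop codon", 10, 12)],
)

def annotated_regions(connection, seq_id):
    """Return (descriptor, subsequence) pairs for one sequence, ordered by position."""
    rows = connection.execute(
        "SELECT a.descriptor, "
        "substr(s.residues, a.start_pos, a.end_pos - a.start_pos + 1) "
        "FROM annotation a JOIN sequence s ON s.seq_id = a.seq_id "
        "WHERE a.seq_id = ? ORDER BY a.start_pos",
        (seq_id,),
    )
    return rows.fetchall()

regions = annotated_regions(conn, "demo_seq")
```

Keeping annotations in a separate table lets any number of annotations refer to one stored sequence through the shared identifier.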

[4412] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.
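As a simple illustration of searching stored sequences for a target sequence or motif, the sketch below performs exact substring matching. It is not BLAST and does not score inexact matches; the function name `find_matches` and the example sequences are hypothetical.

```python
def find_matches(target, sequences):
    """Return {identifier: [1-based start positions]} for exact occurrences of target.

    sequences: {identifier: residue string}. Overlapping occurrences are reported.
    """
    hits = {}
    for seq_id, residues in sequences.items():
        positions = []
        start = residues.find(target)
        while start != -1:
            positions.append(start + 1)  # convert to 1-based position
            start = residues.find(target, start + 1)
        if positions:
            hits[seq_id] = positions
    return hits

# Placeholder sequences for illustration only.
stored = {"seq_a": "ATGATGCC", "seq_b": "CCCCGGGG"}
matches = find_matches("ATG", stored)
```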

[4413] Thus, in one aspect, the invention features a method of analyzing 33521, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing a 33521 nucleic acid or amino acid sequence; and comparing the 33521 sequence with a second sequence, e.g., one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 33521. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[4414] The method can include evaluating the sequence identity between a 33521 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.
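Evaluating sequence identity can be sketched as below for two sequences that have already been aligned to equal length, with `-` marking gaps. The denominator convention (all columns except those where both sequences have a gap) is an assumption, since several conventions are in use; the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns of a pairwise alignment.

    Columns where both sequences have a gap are ignored; a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0
```

For example, `percent_identity("ACGT", "ACGA")` compares four columns, three of which are identical.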

[4415] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
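The relationship between target length and chance occurrence can be made concrete with a rough estimate. The sketch assumes, purely for illustration, that residues are independent and uniformly distributed, which real sequences are not; the function name `expected_random_hits` and the database size are hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size):
    """Expected number of chance exact matches of a target in a random database.

    Assumes independent, uniformly distributed residues (an idealization).
    alphabet_size is 4 for nucleotides or 20 for amino acids.
    """
    if target_length > database_length:
        return 0.0
    windows = database_length - target_length + 1  # possible start positions
    return windows * (1.0 / alphabet_size) ** target_length

# A six-nucleotide target versus a thirty-nucleotide target in the same database.
short_hits = expected_random_hits(6, 1_000_005, 4)
long_hits = expected_random_hits(30, 1_000_005, 4)
```

Under these assumptions the expected count falls by a factor equal to the alphabet size for each added residue, which is why longer targets are far less likely to match by chance.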

[4416] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly and a variety of commercially available software for conducting search means are available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[4417] Thus, the invention features a method of making a computer readable record of a 33521 sequence which includes recording the sequence on a computer readable matrix. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.
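One way such a computer readable record might be structured is sketched below, with JSON chosen as the serialization format for illustration. The class `SequenceRecord`, its field names, and the placeholder values are hypothetical and not specified by this document.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional, Tuple

@dataclass
class SequenceRecord:
    identifier: str
    sequence: str
    orf: Optional[Tuple[int, int]] = None            # 1-based start and end of the ORF
    features: List[dict] = field(default_factory=list)  # domains, regions, or sites
    transcription_start: Optional[int] = None
    transcription_terminator: Optional[int] = None
    protein_sequence: Optional[str] = None           # full length or mature form
    translated_region_5prime_end: Optional[int] = None

def to_json(record):
    """Serialize a record to a JSON string with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)

def from_json(text):
    """Rebuild a record from JSON, restoring the ORF as a tuple."""
    data = json.loads(text)
    if data["orf"] is not None:
        data["orf"] = tuple(data["orf"])
    return SequenceRecord(**data)

# Placeholder record for illustration only.
record = SequenceRecord(
    identifier="demo_seq",
    sequence="ATGGCCAAGTGA",
    orf=(1, 12),
    features=[{"descriptor": "placeholder site", "start": 4, "end": 6}],
    protein_sequence="MAK",
    translated_region_5prime_end=1,
)
restored = from_json(to_json(record))
```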

[4418] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing a 33521 sequence, or record, in machine-readable form; comparing a second sequence to the 33521 sequence; thereby analyzing a sequence. Comparison can include comparing to sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 33521 sequence includes a sequence being compared. In a preferred embodiment the 33521 or second sequence is stored on a first computer, e.g., at a first site and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 33521 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[4419] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has a 33521-associated disease or disorder or a pre-disposition to a 33521-associated disease or disorder, wherein the method comprises the steps of determining 33521 sequence information associated with the subject and based on the 33521 sequence information, determining whether the subject has a 33521-associated disease or disorder or a pre-disposition to a 33521-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[4420] The invention further provides in an electronic system and/or in a network, a method for determining whether a subject has a 33521-associated disease or disorder or a pre-disposition to a disease associated with a 33521 wherein the method comprises the steps of determining 33521 sequence information associated with the subject, and based on the 33521 sequence information, determining whether the subject has a 33521-associated disease or disorder or a pre-disposition to a 33521-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, comparing the 33521 sequence of the subject to the 33521 sequences in the database to thereby determine whether the subject has a 33521-associated disease or disorder, or a pre-disposition for such.

[4421] The present invention also provides in a network, a method for determining whether a subject has a 33521-associated disease or disorder or a pre-disposition to a 33521-associated disease or disorder, said method comprising the steps of receiving 33521 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 33521 and/or corresponding to a 33521-associated disease or disorder (e.g., cell proliferation and differentiation disorder, e.g., cancer and metastasis), and based on one or more of the phenotypic information, the 33521 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has a 33521-associated disease or disorder or a pre-disposition to a 33521-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[4422] The present invention also provides a method for determining whether a subject has a 33521-associated disease or disorder or a pre-disposition to a 33521-associated disease or disorder, said method comprising the steps of receiving information related to 33521 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 33521 and/or related to a 33521-associated disease or disorder, and based on one or more of the phenotypic information, the 33521 information, and the acquired information, determining whether the subject has a 33521-associated disease or disorder or a pre-disposition to a 33521-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[4423] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

Background of the 23479, 48120, and 46689 Invention

[4424] Hydrolases are a large class of enzymes that use a water molecule to catalyze the cleavage of a chemical bond. Hydrolases play important roles in the synthesis and breakdown of almost all major metabolic intermediates, including polypeptides, nucleic acids, and lipids.

[4425] One family of hydrolases consists of the ubiquitin carboxy-terminal hydrolases. The ubiquitin pathway is one example of a post-translational mechanism used to regulate protein levels. Ubiquitin is a highly conserved polypeptide expressed in all eukaryotic cells that marks proteins for degradation. Ubiquitin is attached as a single molecule or as a conjugated form to lysine residue(s) of proteins via formation of an isopeptide bond at the C-terminal glycine residue. Most ubiquitinated proteins are subsequently targeted to the 26S proteasome, a multicatalytic protease, which cleaves the marked protein into peptide fragments.

[4426] Only the protein conjugated to ubiquitin is degraded via the proteasome; ubiquitin itself is recycled by ubiquitin carboxy-terminal hydrolases (UCH; sometimes abbreviated UCTH), which cleave the bond between ubiquitin and the protein targeted for degradation. These enzymes constitute a family of thiol proteases, and homologues have been found in, for example, yeast (Miller et al., Bio Technology 7:698-704, 1989; Tobias and Varshavsky, J. Biol. Chem. 266:12021-12028, 1991; Baker et al., J. Biol. Chem. 267:23364-23375, 1992), bovine (Papa and Hochstrasser, Nature 366:313-319, 1993), avian (Woo et al., J. Biol. Chem. 270:18766-18773, 1995), Drosophila (Zhang et al., Dev. Biol. 17:214, 1993) and human (Wilkinson et al., Science 246:670, 1989) cells.

[4427] Another family of hydrolases consists of α/β hydrolases. The α/β hydrolase family of enzymes is a phylogenetically diverse group of enzymes that share a common fold, typically comprising an eight-stranded β-sheet surrounded by α-helices (Ollis, D. et al. (1992) Protein Eng 5:197-211; Nardini and Dijkstra (1999) Curr Opin Str Bio 9:732-737). Members of the α/β hydrolase family are found in nearly all organisms, from microbes to plants to humans. Enzymes possessing the α/β hydrolase fold diverged from a common ancestor, but have preserved a catalytic mechanism that utilizes a triad of residues consisting of a nucleophile, an acidic residue, and a histidine residue (Ollis, D. et al. (1992) Protein Eng. 5:197-211). Although only the histidine residue is invariant, the other two residues in the triad are limited in terms of the amino acid residues that are functionally acceptable. Thus, the nucleophile is usually a serine residue, but can also be an aspartate or a cysteine residue, while the acidic residue is either an aspartic or glutamic acid residue (Schrag, J. et al. (1997) Meth Enzymol 284:85-107). The relative order of the three catalytic residues in the amino acid chain is always the same: nucleophile, acid, histidine.

[4428] Members of the α/β hydrolase family of enzymes include enzymes that hydrolyze ester bonds (e.g., phosphatases, sulfatases, exonucleases, and endonucleases), glycosidases, enzymes that act on ether bonds, peptidases (e.g., exopeptidases and endopeptidases), as well as enzymes that hydrolyze carbon-nitrogen bonds, acid anhydrides, carbon-carbon bonds, halide bonds, phosphorous-nitrogen bonds, sulfur-nitrogen bonds, carbon-phosphorous bonds, and sulfur-sulfur bonds (E. C. Webb ed., Enzyme Nomenclature, pp. 306-450, © 1992 Academic Press, Inc. San Diego, Calif.). Some specific biological activities of these enzymes include lipase activity, e.g., fungal, bacterial and pancreatic lipases, acetylcholinesterase activity, serine carboxypeptidase activity, prolyl aminopeptidase activity, haloalkane dehalogenase activity, dienelactone hydrolase activity, A2 bromoperoxidase activity, and thioesterase activity (Schrag, J. et al., supra). Acetylcholinesterases, epoxide hydrolases, cholesterol esterases, and lipases have particular medical importance. Inhibitors of acetylcholinesterase are useful therapeutic agents for the treatment of Alzheimer's disease, myasthenia gravis, and glaucoma; epoxide hydrolases and dienelactone hydrolases detoxify harmful aromatic compounds in mammals; and the human hormone sensitive lipase catalyzes the rate-limiting reaction of fat hydrolysis in adipocytes.

[4429] Given the important role that hydrolases play in the synthesis and breakdown of metabolic intermediates, including polypeptides, nucleic acids, and lipids, it is not surprising that their activity significantly impacts the activity of the cell. For example, hydrolases contribute to the growth and differentiation of the cell, to cellular proliferation, adhesion, and motility, and to the interaction and communication that takes place between cells. In addition, hydrolases are important in the conversion of pro-proteins and pro-hormones to their active forms, the inactivation of peptides, the biotransformation of compounds (e.g., a toxin or carcinogen), antigen presentation, and the regulation of synaptic transmission.

Summary of the 23479, 48120, and 46689 Invention

[4430] The present invention is based, in part, on the discovery of novel hydrolase molecules, referred to herein as “23479, 48120, and 46689”. The nucleotide sequences of cDNAs encoding 23479, 48120, and 46689 are recited in SEQ ID NO:74, SEQ ID NO:77, and SEQ ID NO:80, respectively, and the amino acid sequences of 23479, 48120, and 46689 polypeptides are recited in SEQ ID NO:75, SEQ ID NO:78, and SEQ ID NO:81, respectively. In addition, the nucleotide sequences of the coding regions are recited in SEQ ID NO:76, SEQ ID NO:79, and SEQ ID NO:82, respectively.

[4431] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes a 23479, 48120, or 46689 protein or polypeptide, e.g., a biologically active portion of the 23479, 48120, or 46689 protein. In a preferred embodiment the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO:75, SEQ ID NO:78, or SEQ ID NO:81. In other embodiments, the invention provides isolated 23479, 48120, or 46689 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:82, or the sequence of a DNA insert of the plasmids deposited with ATCC Accession Numbers as described herein. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:82, or the sequence of a DNA insert of the plasmids deposited with ATCC Accession Numbers as described herein. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:82, or the sequence of a DNA insert of the plasmids deposited with ATCC Accession Numbers as described herein, wherein the nucleic acid encodes a full length 23479, 48120, or 46689 protein or an active fragment thereof.

[4432] In a related aspect, the invention further provides nucleic acid constructs that include a 23479, 48120, or 46689 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included, are vectors and host cells containing the 23479, 48120, or 46689 nucleic acid molecules of the invention e.g., vectors and host cells suitable for producing 23479, 48120, or 46689 nucleic acid molecules and polypeptides.

[4433] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 23479, 48120, or 46689-encoding nucleic acids.

[4434] In still another related aspect, isolated nucleic acid molecules that are antisense to a 23479, 48120, or 46689-encoding nucleic acid molecule are provided.

[4435] In another aspect, the invention features 23479, 48120, or 46689 polypeptides, and biologically active or antigenic fragments thereof that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 23479, 48120, or 46689-mediated or -related disorders. In another embodiment, the invention provides 23479, 48120, or 46689 polypeptides having a 23479, 48120, or 46689 activity. Preferred polypeptides are 23479, 48120, or 46689 proteins including at least one hydrolase domain, and, preferably, having a 23479, 48120, or 46689 activity, e.g., a 23479, 48120, or 46689 activity as described herein.

[4436] In other embodiments, the invention provides 23479, 48120, or 46689 polypeptides, e.g., a 23479, 48120, or 46689 polypeptide having the amino acid sequence shown in SEQ ID NO:75, SEQ ID NO:78, SEQ ID NO:81, or an amino acid sequence encoded by a cDNA insert of one of the plasmids deposited with ATCC Accession Number as described herein; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO:75, SEQ ID NO:78, SEQ ID NO:81, or an amino acid sequence encoded by a cDNA insert of one of the plasmids deposited with ATCC Accession Number as described herein; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:82, or the sequence of a DNA insert of the plasmids deposited with ATCC Accession Numbers as described herein, wherein the nucleic acid encodes a full length 23479, 48120, or 46689 protein or an active fragment thereof.

[4437] In a related aspect, the invention provides 23479, 48120, or 46689 polypeptides or fragments operatively linked to non-23479, 48120, or 46689 polypeptides to form fusion proteins.

[4438] In another aspect, the invention features antibodies and antigen-binding fragments thereof that react with or, more preferably, specifically bind 23479, 48120, or 46689 polypeptides or fragments thereof, e.g., a hydrolase domain of a 23479, 48120, or 46689 polypeptide.

[4439] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 23479, 48120, or 46689 polypeptides or nucleic acids.

[4440] In still another aspect, the invention provides a process for modulating 23479, 48120, or 46689 polypeptide or nucleic acid expression or activity, e.g., using the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 23479, 48120, or 46689 polypeptides or nucleic acids, such as conditions involving aberrant or deficient cellular proliferation or differentiation.

[4441] In yet another aspect, the invention provides methods for inhibiting the proliferation, or inducing the killing, of a 23479, 48120, or 46689-expressing cell, e.g., a hyper-proliferative 23479, 48120, or 46689-expressing cell. The method includes contacting the cell with an agent, e.g., a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 23479, 48120, or 46689 polypeptide or nucleic acid. In a preferred embodiment, the contacting step occurs in vitro or ex vivo. In other embodiments, the contacting step is effected in vivo, e.g., in a subject (e.g., a mammal, e.g., a human), as part of a therapeutic or prophylactic protocol.

[4442] In one embodiment, the cell is a hyperproliferative cell, e.g., a cell found in a solid tumor, a soft tissue tumor, or a metastatic lesion, e.g., a tumor or metastatic lesion of the lung, brain, ovary or breast.

[4443] In other embodiments, the cell is a neuron or a glial cell, e.g., a cortical or hypothalamic cell. In yet other embodiments, the cell is a cardiovascular cell, e.g., a heart- or blood vessel-associated cell.

[4444] In a preferred embodiment, the agent, e.g., the compound, is an inhibitor of a 23479, 48120, or 46689 polypeptide. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion). In another embodiment, the agent, e.g., the compound, is an inhibitor of a 23479, 48120, or 46689 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[4445] In a preferred embodiment, the agent, e.g., the compound, is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[4446] In another aspect, the invention features methods for treating or preventing a disorder characterized by aberrant cellular proliferation or differentiation of a 23479, 48120, or 46689-expressing cell, in a subject. Preferably, the method includes administering to the subject (e.g., a mammal, e.g., a human) an effective amount of a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 23479, 48120, or 46689 polypeptide or nucleic acid.

[4447] The disorder can be a cancerous or pre-cancerous condition, e.g., a solid tumor, a soft tissue tumor, or a metastatic lesion, e.g., a tumor or metastatic lesion of the lung, brain, ovary or breast. In other embodiments, the disorder is a neurological or a cardiovascular disorder.

[4448] In a further aspect, the invention provides methods for evaluating the efficacy of a treatment of a disorder, e.g., a cellular proliferative or differentiative disorder, a neurological or a cardiovascular disorder. The method includes: treating a subject, e.g., a patient or an animal, with a protocol under evaluation (e.g., treating a subject with one or more of: chemotherapy, radiation, and/or a compound identified using the methods described herein); and evaluating the expression of a 23479, 48120, or 46689 nucleic acid or polypeptide before and after treatment. A change, e.g., a decrease or increase, in the level of a 23479, 48120, or 46689 nucleic acid (e.g., mRNA) or polypeptide after treatment, relative to the level of expression before treatment, is indicative of the efficacy of the treatment of the disorder. The level of 23479, 48120, or 46689 nucleic acid or polypeptide expression can be detected by any method described herein.

[4449] In a preferred embodiment, the evaluating step includes obtaining a sample (e.g., a tissue sample, e.g., a biopsy, or a fluid sample) from the subject before and after treatment, and comparing the level of expression of a 23479, 48120, or 46689 nucleic acid (e.g., mRNA) or polypeptide before and after treatment.

[4450] In another aspect, the invention provides methods for evaluating the efficacy of a therapeutic or prophylactic agent (e.g., an anti-neoplastic agent). The method includes: contacting a sample with an agent (e.g., a compound identified using the methods described herein, a cytotoxic agent) and evaluating the expression of 23479, 48120, or 46689 nucleic acid or polypeptide in the sample before and after the contacting step. A change, e.g., a decrease or increase, in the level of 23479, 48120, or 46689 nucleic acid (e.g., mRNA) or polypeptide in the sample obtained after the contacting step, relative to the level of expression in the sample before the contacting step, is indicative of the efficacy of the agent. The level of 23479, 48120, or 46689 nucleic acid or polypeptide expression can be detected by any method described herein. In a preferred embodiment, the sample includes cells obtained from a cancerous tissue or a lung, brain, ovary, or breast tissue.

[4451] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 23479, 48120, or 46689 polypeptide or nucleic acid molecule, including for disease diagnosis.

[4452] In another aspect, the invention features a two-dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes a 23479, 48120, or 46689 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to a 23479, 48120, or 46689 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 23479, 48120, or 46689 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.

[4453] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

Detailed Description of 23479, 48120, and 46689

[4454] Human 23479

[4455] The human 23479 sequence (see SEQ ID NO:74, as recited in Example 48), which is approximately 3494 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 2805 nucleotides, including the termination codon. The coding sequence encodes a 934 amino acid protein (see SEQ ID NO:75, as recited in Example 48).

[4456] Human 23479 contains the following regions or structural features:

[4457] a ubiquitin carboxyl-terminal hydrolase-1 (UCH-1) domain (FIG. 34A; PFAM Accession PF00442) located at about amino acid residues 296-327 of SEQ ID NO:75;

[4458] a ubiquitin carboxyl-terminal hydrolase-2 (UCH-2) domain (FIG. 34B; PFAM Accession PF00443) located at about amino acid residues 546-640 of SEQ ID NO:75;

[4459] two predicted N-glycosylation sites (PS00001) located at about amino acid residues 94-97 and 739-742 of SEQ ID NO:75;

[4460] one predicted cAMP and cGMP-dependent protein kinase phosphorylation site (PS00004) located at about amino acid residues 60-63 of SEQ ID NO:75;

[4461] twelve predicted Protein Kinase C phosphorylation sites (PS00005) located at about amino acid residues 228-230, 241-243, 326-328, 402-404, 432-434, 451-453, 490-492, 529-531, 611-613, 619-621, 706-708, and 932-934 of SEQ ID NO:75;

[4462] sixteen predicted Casein Kinase II phosphorylation sites (PS00006) located at about amino acid residues 23-26, 48-51, 85-88, 156-159, 228-231, 338-341, 390-393, 426-429, 446-449, 451-454, 617-620, 695-698, 808-811, 890-893, 905-908, and 930-933 of SEQ ID NO:75;

[4463] three predicted tyrosine kinase phosphorylation sites (PS00007) located at about amino acid residues 22-30, 543-550, and 674-681 of SEQ ID NO:75;

[4464] seven predicted N-myristoylation sites (PS00008) located at about amino acid residues 86-91, 256-261, 408-413, 560-565, 607-612, 798-803, and 814-819 of SEQ ID NO:75;

[4465] one predicted amidation site (PS00009) located at about amino acid residues 467-470 of SEQ ID NO:75;

[4466] one predicted ubiquitin carboxyl-terminal hydrolase family 2 signature 2 (PS00973) located at about amino acid residues 550-567 of SEQ ID NO:75;

[4467] one peroxisomal targeting signal located at about amino acid residues 785-793 of SEQ ID NO:75; and

[4468] one predicted coiled coil domain located at about amino acid residues 884-911 of SEQ ID NO:75.

[4469] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.
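The PS-prefixed sites listed above are short sequence motifs, so their positions can be located by simple pattern matching. The following is a minimal sketch, assuming the widely published PROSITE definitions for six of the listed accessions; the patterns themselves are not recited in this text, and the function name `find_sites` and the toy peptide are hypothetical and are not taken from any SEQ ID NO.

```python
import re

# PROSITE-style site patterns written as Python regular expressions.
# Assumption: these follow the commonly published PROSITE definitions;
# the text above names only the accession numbers.
SITE_PATTERNS = {
    "PS00001 N-glycosylation": r"N[^P][ST][^P]",
    "PS00004 cAMP/cGMP-dependent kinase": r"[RK]{2}.[ST]",
    "PS00005 Protein Kinase C": r"[ST].[RK]",
    "PS00006 Casein Kinase II": r"[ST].{2}[DE]",
    "PS00008 N-myristoylation": r"G[^EDRKHPFYW].{2}[STAGCN][^P]",
    "PS00009 amidation": r".G[RK][RK]",
}


def find_sites(sequence):
    """Return {site name: [(start, end), ...]} using 1-based inclusive residues."""
    hits = {}
    for name, pattern in SITE_PATTERNS.items():
        # The lookahead lets overlapping occurrences be reported separately.
        regex = re.compile(f"(?=({pattern}))")
        hits[name] = [
            (m.start() + 1, m.start() + len(m.group(1)))
            for m in regex.finditer(sequence)
        ]
    return hits


# Hypothetical toy peptide used only to exercise the scan.
toy = "MGASNCTAKRRASWSGKKDSLKEE"
for name, positions in find_sites(toy).items():
    print(name, positions)
```

Positions are reported as inclusive residue ranges, matching the "about amino acid residues 94-97" style used in the feature lists. Such matches are predictions only; a motif occurrence does not establish that the site is modified in vivo.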

[4470] A plasmid containing the nucleotide sequence encoding human 23479 (clone “Fbh23479FL”) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

[4471] Human 48120

[4472] The human 48120 sequence (see SEQ ID NO:77, as recited in Example 48), which is approximately 4873 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 3420 nucleotides, including the termination codon. The coding sequence encodes a 1139 amino acid protein (see SEQ ID NO:78, as recited in Example 48).

[4473] Human 48120 contains the following regions or structural features:

[4474] a ubiquitin carboxyl-terminal hydrolase-1 (UCH-1) domain (FIG. 36A; PFAM Accession PF00442) located at about amino acid residues 162-193 of SEQ ID NO:78;

[4475] a ubiquitin carboxyl-terminal hydrolase-2 (UCH-2) domain (FIG. 36B; PFAM Accession PF00443) located at about amino acid residues 580-649 of SEQ ID NO:78;

[4476] a ubiquitin associated (UBA) domain (FIG. 36C; PFAM Accession PF00627) located at about amino acid residues 20-61 of SEQ ID NO:78;

[4477] a ubiquitin interaction motif (UIM) domain (FIG. 36D; PFAM Accession PF02809) located at about amino acid residues 96-113 of SEQ ID NO:78;

[4478] six predicted N-glycosylation sites (PS00001) located at about amino acid residues 282-285, 310-313, 373-376, 639-642, 711-714, and 916-919 of SEQ ID NO:78;

[4479] one predicted cAMP and cGMP-dependent protein kinase phosphorylation site (PS00004) located at about amino acid residues 958-961 of SEQ ID NO:78;

[4480] fifteen predicted Protein Kinase C phosphorylation sites (PS00005) located at about amino acid residues 113-115, 134-136, 137-139, 207-209, 228-230, 260-262, 279-281, 347-349, 453-455, 484-486, 517-519, 700-702, 753-755, 1110-1112, and 1137-1139 of SEQ ID NO:78;

[4481] thirty-three predicted Casein Kinase II phosphorylation sites (PS00006) located at about amino acid residues 47-50, 76-79, 109-112, 130-133, 205-208, 248-251, 368-371, 479-482, 484-487, 489-492, 494-497, 503-506, 520-523, 532-535, 550-553, 620-623, 624-627, 662-665, 668-671, 713-716, 719-722, 760-763, 808-811, 822-825, 881-884, 907-910, 918-921, 966-969, 1024-1027, 1028-1031, 1033-1036, 1058-1061, and 1115-1118 of SEQ ID NO:78;

[4482] one predicted tyrosine kinase phosphorylation site (PS00007) located at about amino acid residues 975-982 of SEQ ID NO:78;

[4483] thirteen predicted N-myristoylation sites (PS00008) located at about amino acid residues 12-17, 80-85, 244-249, 294-299, 300-305, 433-438, 594-599, 635-640, 761-766, 839-844, 855-860, 1001-1006, and 1077-1082 of SEQ ID NO:78;

[4484] one predicted carbamoyl-phosphate synthase subdomain signature 2 (PS00867) located at about amino acid residues 1015-1022 of SEQ ID NO:78;

[4485] one predicted ubiquitin carboxyl-terminal hydrolase family 2 signature 2 (PS00973) located at about amino acid residues 584-601 of SEQ ID NO:78; and

[4486] one predicted coiled coil domain located at about amino acid residues 399-431 of SEQ ID NO:78.

[4487] A plasmid containing the nucleotide sequence encoding human 48120 (clone “Fbh48120FL”) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. § 112.

[4488] Human 46689

[4489] The human 46689 sequence (see SEQ ID NO:80, as recited in Example 48), which is approximately 2082 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 1407 nucleotides, including the termination codon. The coding sequence encodes a 468 amino acid protein (see SEQ ID NO:81, as recited in Example 48). The human 46689 protein of SEQ ID NO:81 and FIG. 34 includes an amino-terminal hydrophobic amino acid sequence, consistent with a signal sequence, of about 26 amino acid residues (from amino acid 1 to about amino acid 26 of SEQ ID NO:81), which upon cleavage results in the production of a mature protein form. This mature protein form is approximately 442 amino acid residues in length (from about amino acid residues 27 to 468 of SEQ ID NO:81).
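The nucleotide and amino acid lengths reported for the three sequences can be cross-checked with simple codon arithmetic. The sketch below assumes only that each coding sequence length includes one three-nucleotide termination codon, as stated in the text; the function name `protein_length` is hypothetical.

```python
def protein_length(cds_nt, includes_stop=True):
    """Number of amino acids encoded by a coding sequence of cds_nt nucleotides."""
    if cds_nt % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    codons = cds_nt // 3
    # The termination codon does not encode an amino acid.
    return codons - 1 if includes_stop else codons


# (coding sequence length in nucleotides, reported protein length in residues)
reported = {"23479": (2805, 934), "48120": (3420, 1139), "46689": (1407, 468)}
for gene, (cds_nt, aa) in reported.items():
    print(gene, protein_length(cds_nt), aa)

# Mature 46689 after removal of the predicted 26-residue signal sequence.
mature_46689 = 468 - 26
print(mature_46689)
```

Each computed length agrees with the protein length recited for the corresponding sequence, and the mature 46689 form comes to 442 residues, consistent with residues 27 to 468 of SEQ ID NO:81.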

[4490] Human 46689 contains the following regions or other structural features:

[4491] an α/β hydrolase domain (PFAM Accession Number PF00561) located at about amino acid residues 186 to 419 of SEQ ID NO:81;

[4492] a predicted catalytic acid residue, located at about amino acid residue 360 of SEQ ID NO:81;

[4493] a predicted catalytic histidine residue, located at about amino acid residue 391 of SEQ ID NO:81;

[4494] a predicted signal peptide located at about amino acid residues 1 to 26 of SEQ ID NO:81;

[4495] one predicted transmembrane domain located at about amino acid residues 150 to 167 of SEQ ID NO:81;

[4496] four predicted Protein Kinase C phosphorylation sites (PS00005) located at about amino acid residues 203 to 205, 305 to 307, 313 to 315, and 411 to 413 of SEQ ID NO:81;

[4497] four predicted Casein Kinase II phosphorylation sites (PS00006) located at about amino acid residues 297 to 300, 313 to 316, 357 to 360, and 454 to 457 of SEQ ID NO:81;

[4498] one predicted cAMP/cGMP-dependent protein kinase phosphorylation site (PS00004) located at about amino acid residues 148 to 151 of SEQ ID NO:81;

[4499] two predicted amidation sites (PS00009) located at about amino acid residues 146 to 149, and 437 to 440 of SEQ ID NO:81; and

[4500] five predicted N-myristoylation sites (PS00008) located at about amino acid residues 5 to 10, 52 to 57, 154 to 159, 237 to 242, and 389 to 394 of SEQ ID NO:81.

[4501] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[4502] A plasmid containing the nucleotide sequence encoding human 46689 (clone “Fbh46689FL”) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

TABLE 18
Summary of Sequence Information for 23479, 48120, and 46689

Gene     cDNA            ORF             Polypeptide     Figure              ATCC Accession Number
23479    SEQ ID NO:74    SEQ ID NO:76    SEQ ID NO:75    FIG. 33, 34A-34B
48120    SEQ ID NO:77    SEQ ID NO:79    SEQ ID NO:78    FIG. 35, 36A-36D
46689    SEQ ID NO:80    SEQ ID NO:82    SEQ ID NO:81    FIG. 37, 38, 39

[4503] 23479 and 48120 Polypeptides

[4504] The 23479 and 48120 proteins contain a significant number of structural characteristics in common with members of the ubiquitin carboxyl-terminal hydrolase family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics.

[4505] Members of the ubiquitin carboxyl-terminal hydrolase family of proteins are characterized by a “ubiquitin carboxyl-terminal hydrolase domain.” The term “ubiquitin carboxyl-terminal hydrolase domain” refers to an amino acid sequence that participates in the removal of one or more ubiquitin molecules from a protein that has one or more molecules of ubiquitin attached to it. The term also includes amino acid sequences that cleave conjugated forms of ubiquitin (e.g., in a head to tail orientation linked via a peptide bond) whether or not the ubiquitin conjugate is attached to a protein. For example, a ubiquitin-ubiquitin conjugate (dimer) could be cleaved into monomers, a tri-ubiquitin conjugate could be cleaved into three monomers, or a dimer and a single monomer. In either of these particular examples, the monomer or dimer could remain attached to or be cleaved from the ubiquitinated protein.

[4506] Ubiquitin carboxyl-terminal hydrolases typically contain two conserved regions, a UCH-1 domain and a UCH-2 domain, each of which is thought to participate in the catalytic mechanism. The conserved signature patterns of UCH-1 and UCH-2 are, respectively, as follows: (1) G-[LIVMFY]-x(1,3)-[AGC]-[NASM]-x-C-[FYW]-[LIVMFC]-[NST]-[SACV]-x-[LIVMS]-Q; and (2) Y-x-L-x-[SAG]-[LIVMFT]-x(2)-H-x-G-x(4,5)-G-H-Y (SEQ ID NO:89). 23479 and 48120 proteins preferably contain one or more sequences that conform to these signature patterns.

[4507] A 23479 or 48120 polypeptide can include a “UCH-1 domain” or regions homologous with a “UCH-1 domain.”

[4508] As used herein, the term “UCH-1 domain” includes an amino acid sequence of about 10 to 100 amino acid residues in length and having a bit score for the alignment of the sequence to the UCH-1 domain (HMM) of at least 25. 23479 or 48120 proteins preferably contain sequences that conform to the UCH-1 signature pattern described above. Preferably, a 23479 protein contains the sequence GLTNLGATCYLASTIQ (SEQ ID NO:90). Preferably, a 48120 protein contains the sequence GLKNVGNTCWFSAVIQ (SEQ ID NO:91). Preferably, a UCH-1 domain includes at least about 20 to 50 amino acids, more preferably about 25 to 40 amino acid residues, or about 30 to 35 amino acids and has a bit score for the alignment of the sequence to the UCH-1 domain (HMM) of at least 45 or greater. The UCH-1 domain has been assigned the PFAM Accession PF00442 (http://genome.wustl.edu/Pfam/html). Alignments of the UCH-1 domains of human 23479 and 48120 with consensus amino acid sequences derived from hidden Markov models are depicted in FIG. 34A (23479; amino acids 296 to 327 of SEQ ID NO:75) and FIG. 36A (48120; amino acids 162 to 193 of SEQ ID NO:78).

[4509] In a preferred embodiment, a 23479 or 48120 polypeptide or protein has a “UCH-1 domain” or a region which includes at least about 20 to 50, more preferably about 25 to 40 or 30 to 35, amino acid residues and has at least about 70%, 80%, 90%, 95%, 99%, or 100% homology with a “UCH-1 domain,” e.g., a UCH-1 domain of human 23479 or 48120, e.g., residues 296 to 327 of SEQ ID NO:75 or residues 162 to 193 of SEQ ID NO:78.
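The percentage tiers recited above can be illustrated with a simple calculation. The sketch below is an assumption-laden simplification: it computes percent identity over an ungapped alignment of two equal-length sequences, which is not necessarily the homology measure defined elsewhere in this specification. The function names `percent_identity` and `highest_tier` are hypothetical; the two sequences are SEQ ID NO:90 and SEQ ID NO:91 as quoted in paragraph [4508].

```python
def percent_identity(seq_a, seq_b):
    """Percent of identical positions in an ungapped, equal-length alignment."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


# Tiers recited in the text, in ascending order.
TIERS = (70, 80, 90, 95, 99, 100)


def highest_tier(pct):
    """Highest recited tier met by pct, or None if below 70%."""
    met = [t for t in TIERS if pct >= t]
    return met[-1] if met else None


seq_90 = "GLTNLGATCYLASTIQ"  # SEQ ID NO:90 (23479)
seq_91 = "GLKNVGNTCWFSAVIQ"  # SEQ ID NO:91 (48120)
pct = percent_identity(seq_90, seq_91)
print(pct, highest_tier(pct))
```

For real comparisons, a gapped alignment produced by an alignment program would normally precede the percentage calculation; the ungapped comparison here is chosen only because the two quoted sequences have the same length.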

[4510] A 23479 or 48120 polypeptide can further include a “UCH-2 domain” or regions homologous with a “UCH-2 domain.”

[4511] As used herein, the term “UCH-2 domain” includes an amino acid sequence of about 10 to 150 amino acid residues in length and having a bit score for the alignment of the sequence to the UCH-2 domain (HMM) of at least 50. 23479 or 48120 proteins preferably contain sequences that conform to the UCH-2 signature pattern described above. Preferably, a 23479 protein contains the sequence YDLIGVTVHTGTADGGHY (SEQ ID NO:92). Preferably, a 48120 protein contains the sequence YRLHAVLVHEGQANAGHY (SEQ ID NO:93). Preferably, a UCH-2 domain includes at least about 30 to 125 amino acids, more preferably about 50 to 110 amino acid residues, or about 60 to 100 amino acids and has a bit score for the alignment of the sequence to the UCH-2 domain (HMM) of at least 75 or greater. The UCH-2 domain has been assigned the PFAM Accession PF00443 (http://genome.wustl.edu/Pfam/html). Alignments of the UCH-2 domains of human 23479 and 48120 with consensus amino acid sequences derived from hidden Markov models are depicted in FIG. 34B (23479; amino acids 546 to 640 of SEQ ID NO:75) and FIG. 36B (48120; amino acids 580 to 649 of SEQ ID NO:78).

[4512] In a preferred embodiment, a 23479 or 48120 polypeptide or protein has a “UCH-2 domain” or a region which includes at least about 30 to 125, more preferably about 50 to 110 or 60 to 100, amino acid residues and has at least about 70%, 80%, 90%, 95%, 99%, or 100% homology with a “UCH-2 domain,” e.g., a UCH-2 domain of human 23479 or 48120, e.g., residues 546 to 640 of SEQ ID NO:75 or residues 580 to 649 of SEQ ID NO:78.
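The UCH-1 and UCH-2 signature patterns recited in paragraph [4506] are written in a PROSITE-style syntax that maps directly onto regular expressions. The following is a minimal sketch of such a translation, applied to the four sequences quoted in paragraphs [4508] and [4511]. The helper names `prosite_to_regex` and `conforms` are hypothetical, and the translator handles only the syntax elements that appear in these two patterns plus brace-delimited exclusions.

```python
import re

# One pattern element: x, a residue, [allowed], or {excluded}, with optional (n) or (n,m).
ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern (elements joined by '-') into a regex."""
    parts = []
    for element in pattern.split("-"):
        m = ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported element: {element}")
        core, low, high = m.groups()
        if core == "x":
            core = "."                       # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"   # any residue except those listed
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


def conforms(sequence, pattern):
    """True if the whole sequence matches the pattern exactly."""
    return re.fullmatch(prosite_to_regex(pattern), sequence) is not None


UCH1 = "G-[LIVMFY]-x(1,3)-[AGC]-[NASM]-x-C-[FYW]-[LIVMFC]-[NST]-[SACV]-x-[LIVMS]-Q"
UCH2 = "Y-x-L-x-[SAG]-[LIVMFT]-x(2)-H-x-G-x(4,5)-G-H-Y"

print(conforms("GLKNVGNTCWFSAVIQ", UCH1))    # SEQ ID NO:91 (48120)
print(conforms("YDLIGVTVHTGTADGGHY", UCH2))  # SEQ ID NO:92 (23479)
print(conforms("YRLHAVLVHEGQANAGHY", UCH2))  # SEQ ID NO:93 (48120)
print(conforms("GLTNLGATCYLASTIQ", UCH1))    # SEQ ID NO:90 (23479)
```

With the patterns as printed, SEQ ID NO:91, SEQ ID NO:92, and SEQ ID NO:93 match in full. SEQ ID NO:90 has an alanine at the position where the UCH-1 pattern lists [NST], so a strict match reports False for it. Strict pattern matching is a more stringent test than the HMM-based alignment used to assign the UCH-1 domain, which scores a sequence against a profile rather than requiring every position to fall in a fixed residue set.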

[4513] A 48120 polypeptide can also include a “UBA domain” or regions homologous with a “UBA domain.” A “UBA domain” refers to a commonly occurring amino acid sequence found in several proteins having connections to ubiquitin and the ubiquitination pathway. The structure of the UBA domain consists of a compact three helix bundle.

[4514] As used herein, the term “UBA domain” includes an amino acid sequence of about 10 to 100 amino acid residues in length and having a bit score for the alignment of the sequence to the UBA domain (HMM) of at least 5. Preferably, a UBA domain includes at least about 20 to 80 amino acids, more preferably about 25 to 60 amino acid residues, or about 35 to 45 amino acids and has a bit score for the alignment of the sequence to the UBA domain (HMM) of at least 8 or greater. The UBA domain (HMM) has been assigned the PFAM Accession Number PF00627 (http://genome.wustl.edu/Pfam/html). An alignment of the UBA domain (amino acids 20-61 of SEQ ID NO:78) of human 48120 with a consensus amino acid sequence derived from a hidden Markov model is depicted in FIG. 36C.

[4515] In a preferred embodiment, a 48120 polypeptide or protein has a “UBA domain” or a region which includes at least about 20 to 80, more preferably about 25 to 60 or 35 to 45, amino acid residues and has at least about 70%, 80%, 90%, 95%, 99%, or 100% homology with a “UBA domain,” e.g., a UBA domain of human 48120, e.g., residues 20 to 61 of SEQ ID NO:78.

[4516] A 48120 polypeptide can also include a “ubiquitin interaction motif (UIM) domain” or regions homologous with a “UIM domain.”

[4517] As used herein, the term “UIM domain” includes an amino acid sequence of about 10 to 50 amino acid residues in length and having a bit score for the alignment of the sequence to the UIM domain (HMM) of at least 5. Preferably, a UIM domain includes at least about 10 to 40 amino acids, more preferably about 10 to 30 amino acid residues, or about 15 to 20 amino acids and has a bit score for the alignment of the sequence to the UIM domain (HMM) of at least 10 or greater. The UIM domain (HMM) has been assigned the PFAM Accession Number PF02809 (http://genome.wustl.edu/Pfam/html). An alignment of the UIM domain (amino acids 96-113 of SEQ ID NO:78) of human 48120 with a consensus amino acid sequence derived from a hidden Markov model is depicted in FIG. 36D.

[4518] In a preferred embodiment, a 48120 polypeptide or protein has a “UIM domain” or a region which includes at least about 10 to 40, more preferably about 10 to 30 or 15 to 20, amino acid residues and has at least about 70%, 80%, 90%, 95%, 99%, or 100% homology with a “UIM domain,” e.g., a UIM domain of human 48120, e.g., residues 96 to 113 of SEQ ID NO:78.

[4519] To identify the presence of a UCH-1, UCH-2, UBA, or UIM domain in a 23479 or 48120 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against the Pfam database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference.
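The domain definitions in paragraphs [4508], [4511], [4514], and [4517] each combine a length range with a minimum bit score, and the search described above applies a hit threshold of 15 bits that can be lowered to 8. The sketch below shows one way such criteria could be applied to a reported hit. The names `DomainRule`, `is_hit`, and `qualifies` are hypothetical. The residue coordinates are those recited for 23479 and 48120, but the bit scores passed in are placeholder values, since no scores are stated in this text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainRule:
    min_len: int      # shortest qualifying length, in residues
    max_len: int      # longest qualifying length, in residues
    min_bits: float   # minimum HMM alignment bit score


# Length ranges and minimum bit scores as recited in the domain definitions.
RULES = {
    "UCH-1": DomainRule(10, 100, 25.0),
    "UCH-2": DomainRule(10, 150, 50.0),
    "UBA": DomainRule(10, 100, 5.0),
    "UIM": DomainRule(10, 50, 5.0),
}


def is_hit(bit_score, threshold=15.0):
    """Search-level test: default threshold of 15 bits, which may be lowered (e.g., to 8)."""
    return bit_score >= threshold


def qualifies(domain, start, end, bit_score):
    """Definition-level test: length range and minimum bit score for the named domain."""
    rule = RULES[domain]
    length = end - start + 1  # inclusive residue coordinates
    return rule.min_len <= length <= rule.max_len and bit_score >= rule.min_bits


# Coordinates from the text; bit scores are hypothetical placeholders.
print(qualifies("UCH-1", 296, 327, 30.0))   # 23479 UCH-1, 32 residues
print(qualifies("UCH-2", 580, 649, 40.0))   # 48120 UCH-2, 70 residues, score below 50
print(is_hit(10.0), is_hit(10.0, threshold=8.0))
```

Keeping the search threshold separate from the per-domain rules reflects the text, where the hit threshold is a property of the search program and the length and score limits are part of each domain definition.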

[4520] A 23479 or 48120 family member can include a UCH-1 domain and a UCH-2 domain. A 48120 family member can also include a UBA domain. A 48120 family member can further include a UIM domain.

[4521] A 23479 family member can also include at least one and preferably two N-glycosylation sites (PS00001); at least one cAMP and cGMP-dependent protein kinase phosphorylation site (PS00004); at least one, two, three, four, five, six, seven, eight, nine, 10, 11, and preferably 12 protein kinase C phosphorylation sites (PS00005); at least one, two, three, four, five, six, seven, eight, nine, 10, 11, 12, 13, 14, 15, and preferably 16 casein kinase II phosphorylation sites (PS00006); at least one, two, and preferably three tyrosine kinase phosphorylation sites (PS00007); at least one, two, three, four, five, six, and preferably seven N-myristoylation sites (PS00008); at least one amidation site (PS00009); at least one ubiquitin carboxyl-terminal hydrolase family 2 signature 2 (PS00973); at least one peroxisomal targeting signal; and at least one coiled coil domain.

[4522] A 48120 family member can also include at least one, two, three, four, five, and preferably six N-glycosylation sites (PS00001); at least one cAMP and cGMP-dependent protein kinase phosphorylation site (PS00004); at least one, two, three, four, five, six, seven, eight, nine, 10, 11, 12, 13, 14, and preferably 15 protein kinase C phosphorylation sites (PS00005); at least one, two, three, four, five, six, seven, eight, nine, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, and preferably 33 casein kinase II phosphorylation sites (PS00006); at least one tyrosine kinase phosphorylation site (PS00007); at least one, two, three, four, five, six, seven, eight, nine, 10, 11, 12, and preferably 13 N-myristoylation sites (PS00008); at least one carbamoyl-phosphate synthase subdomain signature 2 (PS00867); at least one ubiquitin carboxyl-terminal hydrolase family 2 signature 2 (PS00973); and at least one coiled coil domain.

[4523] As the 23479 or 48120 polypeptides of the invention may modulate 23479 or 48120-mediated activities, they may be useful as, or for developing, novel diagnostic and therapeutic agents for 23479 or 48120-mediated or related disorders, as described below.

[4524] As used herein, a “23479 or 48120 activity”, “biological activity of 23479 or 48120” or “functional activity of 23479 or 48120”, refers to an activity exerted by a 23479 or 48120 protein, polypeptide or nucleic acid molecule. For example, a 23479 or 48120 activity can be an activity exerted by 23479 or 48120 in a physiological milieu on, e.g., a 23479 or 48120-responsive cell or on a 23479 or 48120 substrate, e.g., a protein substrate. A 23479 or 48120 activity can be determined in vivo or in vitro. In one embodiment, a 23479 or 48120 activity is a direct activity, such as an association with a 23479 or 48120 target molecule. A “target molecule” or “binding partner” is a molecule with which a 23479 or 48120 protein binds or interacts in nature, e.g., a complex of ubiquitin and a protein targeted for degradation.

[4525] A 23479 or 48120 activity can also be an indirect activity, e.g., an activity mediated by a protein that is a target for de-ubiquitination by 23479 or 48120. The features of the 23479 or 48120 molecules of the present invention can provide similar biological activities as ubiquitin carboxyl-terminal hydrolase family members. For example, the 23479 or 48120 proteins of the present invention can have one or more of the following activities: 1) modulation of de-ubiquitination of a substrate, e.g., a ubiquitinated protein targeted for degradation; 2) participation in the processing of poly-ubiquitin precursors; 3) modulation of cellular proliferation and/or differentiation; 4) modulation of apoptosis; 5) modulation of transcription and/or cell-cycle progression; 6) modulation of signal-transduction; 7) modulation of antigen processing; 8) modulation of cell-cell adhesion; 9) modulation of receptor-mediated endocytosis; 10) modulation of organelle biogenesis and development; 11) participation in neuropathological conditions; and 12) participation in oncogenesis.

[4526] Based on the above-described sequence similarities, the 23479 or 48120 molecules of the present invention are predicted to have similar biological activities as ubiquitin carboxyl-terminal hydrolase family members. Ubiquitin carboxyl-terminal hydrolase domains regulate the de-ubiquitination of a substrate, e.g., a protein targeted for degradation. Thus, 23479 or 48120 molecules can act as novel diagnostic targets and therapeutic agents for controlling, e.g., ubiquitination related disorders. 23479 or 48120 molecules of the invention may be useful, for example, in inducing the de-ubiquitination of ubiquitinated proteins. These proteins can therefore modulate protein degradation and the recycling of ubiquitin, as well as participate in cell signaling pathways in which ubiquitination or de-ubiquitination of a protein can alter or modify the activity of the protein. Thus, 23479 or 48120 molecules may act as novel therapeutic agents for controlling disorders associated with excessive or insufficient ubiquitination (e.g., protein degradation), and as diagnostic markers useful for indicating the presence or predisposition towards developing such disorders, or monitoring the progression or regression of a disorder.

[4527] Ubiquitination has been implicated in regulating numerous cellular processes including, for example, proliferation, differentiation, apoptosis (programmed cell death), transcription, signal-transduction, cell-cycle progression, receptor-mediated endocytosis, organelle biogenesis and others. The presence of abnormal amounts of ubiquitinated proteins in neuropathological conditions such as Alzheimer's and Pick's disease indicates that ubiquitination plays a role in various physiological disorders. Oncogenes (e.g., v-jun and v-fos) are often found to be resistant to ubiquitination in comparison to their normal cell counterparts, suggesting that a failure to degrade oncogene protein products accounts for some of their cell transformation capability.

[4528] As the 23479 and 48120 molecules of the invention are expressed in coronary and endothelial tissues, brain tissues, erythroid cells, and lung tumors (See Example 49), they can act as novel diagnostic targets and therapeutic agents for controlling disorders associated with abnormal de-ubiquitination activity and disorders associated with abnormal protein degradation in such tissues. Thus, examples of disorders that can be treated and/or diagnosed with the molecules of the invention include cellular proliferative and/or differentiative disorders (e.g., in the lung), cardiovascular disorders, brain disorders, and hematopoietic disorders.

[4529] 46689 Polypeptides

[4530] The 46689 protein contains a significant number of structural characteristics in common with members of the α/β hydrolase family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics.

[4531] The α/β hydrolase family of proteins is characterized by a common fold. In its most typical form, the fold consists of an eight stranded β-sheet surrounded by α-helices which further includes conserved catalytic residues. This enzyme family includes lipases, esterases, and proteases. Despite the variety of reactions this family is capable of mediating, the chemistry of these reactions is generally similar. Three positions in particular form a catalytic triad, contributing nucleophilic, acidic, and histidine residues. The side chains of these residues are required for nucleophilic attack on one of the atoms (typically a carbon atom) involved in the chemical bond that is to be cleaved, and thus they function during one of the most critical steps in the catalytic process. The nucleophilic residue is typically serine, although a cysteine or an aspartate residue substitutes for serine in some members of the family. The nucleophilic residue is located in a motif that has been termed the “nucleophile elbow”, a loop that makes a sharp turn following the fifth β-strand of the canonical α/β hydrolase fold. Due to the constrained arrangement of the residues that form the nucleophile elbow, the sequence of the nucleophile elbow typically includes two or three glycine residues. Following the nucleophilic residue is the acidic residue, which is located on a loop following the seventh β-strand. This acidic residue can be either an aspartic acid or a glutamic acid residue. Finally, the third residue of the catalytic triad, the histidine residue, is absolutely conserved and is located in a loop that follows the eighth β-strand of the canonical α/β hydrolase domain. Importantly, the relative order of these three catalytic residues within a given α/β hydrolase peptide sequence, nucleophile-acid-histidine, is conserved in all members of the α/β hydrolase family. 
Another conserved feature of the α/β hydrolase fold is an oxyanion hole, located near the end of the third β-strand, which is believed to stabilize potential covalent intermediates formed during the nucleophilic attack step in the catalytic process. The covalent intermediate then proceeds to product by general base catalysis. A detailed description of the α/β hydrolase fold can be found in Ollis et al. (1992), Protein Eng 5(3):197-211, and Nardini and Dijkstra (1999), Curr Opin Struct Biol 9(6):732-7, the contents of which are incorporated herein by reference.

[4532] A 46689 polypeptide can include an “α/β hydrolase domain” or regions homologous with an “α/β hydrolase domain”.

[4533] As used herein, the term “α/β hydrolase domain” includes an amino acid sequence of about 100 to 350 amino acid residues in length which contains a conserved catalytic triad consisting of a nucleophilic amino acid residue, an acidic amino acid residue, and a histidine residue. Preferably, an α/β hydrolase domain includes at least about 150 to 300 amino acids, more preferably about 175 to 250 amino acid residues, or about 200 to 250 amino acids. Based on sequence alignments, the presence and extent of an α/β hydrolase domain in a test protein sequence can be determined. One description of an α/β hydrolase domain (HMM) has been assigned the PFAM Accession Number PF00561 (http://pfam.wustl.edu). An alignment of human 46689 (about amino acids 186 to 419 of SEQ ID NO:81) with the PFAM α/β hydrolase domain consensus amino acid sequence (SEQ ID NO:87) derived from a hidden Markov model is depicted in FIG. 38. The alignment demonstrates the presence of a catalytic acid residue, located at about amino acid residue 360 of SEQ ID NO:81, and a catalytic histidine residue, located at about amino acid residue 391 of SEQ ID NO:81. The alignment also suggests that the serine residue located at about amino acid residue 238 of SEQ ID NO:81 may be the catalytic nucleophile residue of human 46689.

[4534] A consensus sequence for α/β hydrolase domain-containing protein families is also provided, e.g., by ProDom family PD007763 (ProDomain Release 2001.1; http://www.toulouse.inra.fr/prodom.html). An alignment of a large portion of human 46689 (about amino acid residues 97 to 424 of SEQ ID NO:81) with the consensus amino acid sequence of an α/β hydrolase-containing family of proteins (SEQ ID NO:88) derived from recursive PSI-BLAST searches, is depicted in FIG. 39. This alignment also reveals the presence of the conserved catalytic acid and catalytic histidine residues of human 46689 polypeptides.

[4535] In a preferred embodiment, a 46689 polypeptide or protein has an “α/β hydrolase domain” or a region which includes a conserved catalytic triad consisting of a nucleophilic residue, an acid residue, and a histidine residue, wherein the nucleophilic residue is separated from the catalytic acid residue by about 110 to 145 amino acid residues, more preferably about 115 to 130 amino acid residues, or about 122 amino acid residues.

[4536] In another preferred embodiment, a 46689 polypeptide or protein has an “α/β hydrolase domain” or a region which includes a conserved catalytic triad consisting of a nucleophilic residue, an acid residue, and a histidine residue, wherein the catalytic acid residue is separated from the catalytic histidine residue by about 15 to 40 amino acid residues, more preferably about 25 to 35 amino acid residues, or about 31 amino acid residues.
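
The spacing criteria of the two preceding embodiments can be checked with simple arithmetic on the approximate residue positions given above for human 46689 (serine at about 238, acid at about 360, histidine at about 391 of SEQ ID NO:81). The following Python sketch is illustrative only; the function names `triad_spacing` and `within` are hypothetical and form no part of the invention as described.

```python
from typing import Tuple


def triad_spacing(nucleophile: int, acid: int, histidine: int) -> Tuple[int, int]:
    """Return (nucleophile-to-acid, acid-to-histidine) separations in residues.

    Positions are 1-based residue numbers. The conserved order
    nucleophile < acid < histidine is enforced.
    """
    if not nucleophile < acid < histidine:
        raise ValueError("expected order: nucleophile < acid < histidine")
    return acid - nucleophile, histidine - acid


def within(value: int, low: int, high: int) -> bool:
    """True if value lies in the inclusive range [low, high]."""
    return low <= value <= high


# Approximate positions stated in the text for human 46689 (SEQ ID NO:81).
nuc_to_acid, acid_to_his = triad_spacing(238, 360, 391)

# Ranges taken from the two preferred embodiments above.
nuc_acid_ok = within(nuc_to_acid, 110, 145)
acid_his_ok = within(acid_to_his, 15, 40)
```

With these positions the separations are 122 and 31 residues, which fall inside the broadest and the narrower stated ranges.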

[4537] In yet another preferred embodiment, a 46689 polypeptide or protein has an “α/β hydrolase domain” or a region which includes at least about 100 to 350, more preferably about 175 to 300, or 200 to 250 amino acid residues and has at least about 70%, 80%, 90%, 95%, 98%, 99%, or 100% homology with an “α/β hydrolase domain,” e.g., the α/β hydrolase domain of human 46689 (e.g., residues 186 to 419 of SEQ ID NO:81).

[4538] To identify the presence of an “α/β hydrolase” domain in a 46689 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against the PFAM database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the PFAM database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference. A search was performed against the HMM database resulting in the identification of an “α/β hydrolase” domain in the amino acid sequence of human 46689 at about residues 186 to 419 of SEQ ID NO:81 (see FIG. 38).
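
The effect of the threshold score on which alignments count as hits can be sketched as follows. This Python example is an illustration under stated assumptions: the `DomainHit` type, the `passing_hits` function, the model names, and the scores are all hypothetical, and the comparison is assumed to be inclusive (a score equal to the threshold is treated as a hit). It does not invoke the HMMER package or reproduce any actual search result.

```python
from typing import List, NamedTuple


class DomainHit(NamedTuple):
    model: str        # hypothetical model name
    bit_score: float  # alignment score in bits


def passing_hits(hits: List[DomainHit], threshold: float = 15.0) -> List[DomainHit]:
    """Keep hits whose bit score meets the threshold (assumed inclusive)."""
    return [hit for hit in hits if hit.bit_score >= threshold]


# Made-up scores for two hypothetical models; not results of any real search.
example_hits = [DomainHit("model_A", 22.4), DomainHit("model_B", 9.1)]

default_pass = passing_hits(example_hits)                  # threshold of 15
relaxed_pass = passing_hits(example_hits, threshold=8.0)   # lowered to 8 bits
```

Under the default threshold only the first hypothetical alignment is retained; lowering the threshold to 8 bits retains both.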

[4539] A 46689 molecule can further include an amino acid sequence homologous to “ProDom PD007763 domain.” Members of the family of proteins containing this domain are uncharacterized proteins that share conserved regions, including a hydrolase signature. Yeast protein YMR210W and human protein PHPS1-2 are members of this family.

[4540] As used herein, the term “ProDom PD007763 domain” includes an amino acid sequence of about 250 to 450 amino acid residues in length having a bit score for the alignment of the sequence with ProDom PD007763 of at least 50. Preferably, a ProDom PD007763 domain includes at least 250 to 450 amino acids, more preferably about 275 to 400 amino acid residues, or about 300 to 350 amino acids, and has a bit score for the alignment of the sequence with ProDom PD007763 of at least 75, 85, preferably 95 or more. An alignment of the ProDom PD007763 domain of human 46689 (amino acids 97 to 424 of SEQ ID NO:81) with a consensus amino acid sequence derived from a hidden Markov model is depicted in FIG. 39.

[4541] In a preferred embodiment, a 46689 polypeptide or protein has a “ProDom PD007763 domain” or a region which includes at least about 250 to 450, more preferably about 275 to 400, or 300 to 350 amino acid residues and has at least about 70%, 80%, 90%, 95%, 99%, or 100% homology with a “ProDom PD007763 domain”, e.g., the ProDom PD007763 domain of human 46689 (e.g., residues 97 to 424 of SEQ ID NO:81).

[4542] To identify the presence of a “ProDom PD007763 domain” in a 46689 protein sequence, and make the determination that a polypeptide or protein of interest contains such a domain, the amino acid sequence of the protein can be searched against the ProDom database (Corpet et al. (1999), Nucl. Acids Res. 27:263-267). The ProDom protein domain database consists of an automatic compilation of homologous domains. Current versions of ProDom are built using recursive PSI-BLAST searches (Altschul S F et al. (1997) Nucleic Acids Res. 25:3389-3402; Gouzy et al. (1999) Computers and Chemistry 23:333-340) of the SWISS-PROT 38 and TREMBL protein databases. The database automatically generates a consensus sequence for each domain. A BLAST search was performed against the ProDom database resulting in the identification of an “α/β hydrolase” domain within the amino acid sequence of human 46689 that includes residues 97 to 424 of SEQ ID NO:81 (FIG. 39).

[4543] A 46689 protein can further include at least one transmembrane domain. As used herein, the term “transmembrane domain” includes an amino acid sequence of at least about 10 amino acid residues in length that spans the plasma membrane. More preferably, a transmembrane domain includes at least about 15 amino acid residues and spans the plasma membrane. Transmembrane domains are rich in hydrophobic residues, and typically have an alpha-helical structure. In a preferred embodiment, at least 50%, 60%, 70%, 80%, 90%, 95% or more of the amino acids of a transmembrane domain are hydrophobic, e.g., leucines, isoleucines, tyrosines, or tryptophans. Transmembrane domains are described in, for example, Zagotta W. N. et al., (1996) Annual Rev. Neurosci. 19: 235-263, the contents of which are incorporated herein by reference. Amino acid residues 150 to 167 of the 46689 protein (SEQ ID NO:81) are predicted to be a transmembrane domain (see FIG. 37). Accordingly, 46689 proteins having at least 50-60% homology, preferably about 60-70%, more preferably about 70-80%, or about 80-90% homology with at least one transmembrane domain of human 46689 are within the scope of the invention.

[4544] In a preferred embodiment, a 46689 protein has a “transmembrane domain” or a region which includes at least about 10, more preferably at least about 15 amino acid residues and has at least about 70%, 80%, 90%, 95%, 99%, or 100% homology with a “transmembrane domain,” e.g., the transmembrane domain of 46689 protein (e.g., residues 150 to 167 of SEQ ID NO:81).

[4545] A 46689 protein can further include a signal sequence. As used herein, a “signal peptide” or “signal sequence” refers to a peptide of about 15 to 60, preferably about 20 to 40, more preferably, 27 amino acid residues in length which occurs at the N-terminus of secretory and integral membrane proteins and which contains a majority of hydrophobic amino acid residues. For example, a signal sequence contains at least about 15 to 60, preferably about 20 to 40, more preferably, 27 amino acid residues, and has at least about 40-70%, preferably about 50-65%, and more preferably about 55-60% hydrophobic amino acid residues (e.g., alanine, valine, leucine, isoleucine, phenylalanine, tyrosine, tryptophan, or proline). Such a “signal sequence”, also referred to in the art as a “signal peptide”, serves to direct a protein containing such a sequence to a lipid bilayer. For example, in one embodiment, a 46689 protein contains a signal sequence of about 26 amino acids. The “signal sequence” is cleaved during processing of the mature protein. The mature 46689 protein corresponds to amino acids 27 to 468 of SEQ ID NO:81.
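
The relationship between a signal sequence of about 26 residues and a mature protein spanning amino acids 27 to 468, together with the hydrophobic-content criterion, can be sketched in Python as below. The function names are hypothetical, the hydrophobic set is simply the list of example residues given above (alanine, valine, leucine, isoleucine, phenylalanine, tyrosine, tryptophan, proline), and cleavage is assumed to occur immediately after the last signal residue.

```python
from typing import Tuple

# One-letter codes for the hydrophobic residues listed as examples in the text.
HYDROPHOBIC = frozenset("AVLIFYWP")


def hydrophobic_fraction(peptide: str) -> float:
    """Fraction of residues in the peptide that belong to HYDROPHOBIC."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    residues = peptide.upper()
    return sum(residue in HYDROPHOBIC for residue in residues) / len(residues)


def split_signal(sequence: str, signal_length: int) -> Tuple[str, str]:
    """Split a precursor into (signal peptide, mature protein)."""
    return sequence[:signal_length], sequence[signal_length:]


def mature_coordinates(total_length: int, signal_length: int) -> Tuple[int, int]:
    """1-based, inclusive start and end of the mature protein."""
    return signal_length + 1, total_length


# A 468-residue precursor with a 26-residue signal sequence, as in the text.
mature_start, mature_end = mature_coordinates(468, 26)
mature_length = mature_end - mature_start + 1
```

For these values the mature protein begins at residue 27, ends at residue 468, and is 442 residues long.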

[4546] A 46689 family member can include at least one α/β hydrolase domain and/or at least one ProDom PD007763 domain. Furthermore, a 46689 family member can include at least one catalytic acid residue; at least one catalytic histidine residue; at least one transmembrane domain; at least one signal peptide; at least one, two, three, preferably four predicted protein kinase C phosphorylation sites (PS00005); at least one, two, three, preferably four predicted casein kinase II phosphorylation sites (PS00006); at least one predicted cAMP- and cGMP-dependent protein kinase phosphorylation site (PS00004); at least one, preferably two predicted amidation sites (PS00009); and at least one, two, three, four, preferably five predicted N-myristylation sites (PS00008).

[4547] As the 46689 polypeptides of the invention may modulate 46689-mediated activities, they may be useful for developing novel diagnostic and therapeutic agents for 46689-mediated or related disorders, as described below.

[4548] As used herein, a “46689 activity”, “biological activity of 46689” or “functional activity of 46689”, refers to an activity exerted by a 46689 protein, polypeptide or nucleic acid molecule. For example, a 46689 activity can be an activity exerted by 46689 in a physiological milieu on, e.g., a 46689-responsive cell or on a 46689 substrate, e.g., a protein, lipid, or small molecule substrate. A 46689 activity can be determined in vivo or in vitro. In one embodiment, a 46689 activity is a direct activity, such as an association with a 46689 target molecule. A “target molecule” or “binding partner” is a molecule with which a 46689 protein binds or interacts in nature. In an exemplary embodiment, 46689 hydrolyzes a substrate, e.g., a protein, lipid, or small molecule (e.g., metabolite, signaling molecule, toxin, or carcinogen) substrate.

[4549] A 46689 activity can also be an indirect activity, e.g., modulation of a cellular signaling activity mediated by a 46689 substrate or product. The features of the 46689 molecules of the present invention can provide similar biological activities as α/β hydrolase family members. For example, the 46689 proteins of the present invention can have one or more of the following activities: (1) hydrolysis of lipid substrates; (2) hydrolysis of cholesterol; (3) hydrolysis of epoxides and other toxic chemicals; (4) hydrolysis of acetylcholine and other neurotransmitters; (5) protease activity; (6) hydrolysis of carboxylesters; or (7) thioesterase activity. As a result, the 46689 protein may have a critical function in one or more of the following physiological processes: (1) metabolite regulation and degradation; (2) drug metabolism; (3) toxin or carcinogen removal and neutralization; (4) toxin or carcinogen production; (5) cellular proliferation or differentiation; and (6) neuronal function.

[4550] The 46689 polypeptide may be involved in disorders of metabolic imbalance, including obesity, anorexia nervosa, cachexia, lipid disorders, cholesterol imbalance, and diabetes. For example, many α/β hydrolase family members have lipase activity and cholesterol esterase activity and the human hormone sensitive lipase is responsible for metabolizing fat stored in adipocytes. In addition, the methods can be employed to detect liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, α1-antitrypsin deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease).

[4551] The 46689 polypeptide may be involved in disorders of toxin (e.g., carcinogen) metabolism and/or removal. Examples include disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome). Additionally, the methods described herein may be useful for the early detection and treatment of liver injury associated with the administration of various chemicals or drugs, such as for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol.

[4552] 46689 molecules may also have a critical role in removing xenobiotic epoxides and other toxins from the body. Furthermore, they may contribute to the metabolism of drugs and other pharmaceuticals. The study of polymorphisms in the 46689 gene should provide a useful resource for pharmacogenomic (see below) analysis of drug responses. Additionally, variations in 46689 may contribute to population differences in sensitivity to environmental toxins.

[4553] In addition, as the 46689 molecules are expressed in bone marrow tissue, lung tissue, thymus tissue, glial cells, brain tissue, and kidney tissue, as well as in lung, brain, ovary, and breast tumors, they can act as novel diagnostic targets and therapeutic agents for controlling disorders associated with abnormal cellular metabolism in those tissues.

[4554] Thus, examples of disorders that can be treated and/or diagnosed with the molecules of the invention include cellular proliferative and/or differentiative disorders (e.g., in the lung, brain, ovary, or breast), hematopoietic disorders, neural disorders (e.g., brain disorders), liver disorders, and cardiovascular disorders.

[4555] Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast and liver origin.

[4556] As used herein, the terms “cancer”, “hyperproliferative” and “neoplastic” refer to cells having the capacity for autonomous growth. Examples of such cells include cells having an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative and neoplastic disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. “Pathologic hyperproliferative” cells occur in disease states characterized by malignant tumor growth. Examples of non-pathologic hyperproliferative cells include proliferation of cells associated with wound repair.

[4557] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[4558] The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

[4559] The term “sarcoma” is art recognized and refers to malignant tumors of mesenchymal derivation.

[4560] Examples of cellular proliferative and/or differentiative disorders of the colon include, but are not limited to, non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors.

[4561] Examples of cellular proliferative and/or differentiative disorders of the liver include, but are not limited to, nodular hyperplasias, adenomas, and malignant tumors, including primary carcinoma of the liver and metastatic tumors.

[4562] Examples of cellular proliferative and/or differentiative disorders of the breast include, but are not limited to, proliferative breast disease including, e.g., epithelial hyperplasia, sclerosing adenosis, and small duct papillomas; tumors, e.g., stromal tumors such as fibroadenoma, phyllodes tumor, and sarcomas, and epithelial tumors such as large duct papilloma; carcinoma of the breast including in situ (noninvasive) carcinoma that includes ductal carcinoma in situ (including Paget's disease) and lobular carcinoma in situ, and invasive (infiltrating) carcinoma including, but not limited to, invasive ductal carcinoma, invasive lobular carcinoma, medullary carcinoma, colloid (mucinous) carcinoma, tubular carcinoma, and invasive papillary carcinoma, and miscellaneous malignant neoplasms. Disorders in the male breast include, but are not limited to, gynecomastia and carcinoma.

[4563] Examples of cellular proliferative and/or differentiative disorders of the lung include, but are not limited to, bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

[4564] Additional examples of proliferative disorders include hematopoietic neoplastic disorders. As used herein, the term “hematopoietic neoplastic disorders” includes diseases involving hyperplastic/neoplastic cells of hematopoietic origin. A hematopoietic neoplastic disorder can arise from myeloid, lymphoid or erythroid lineages, or precursor cells thereof. Preferably, the diseases arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit Rev. in Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to, acute lymphoblastic leukemia (ALL) which includes B-lineage ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell leukemia (HLL) and Waldenstrom's macroglobulinemia (WM). Additional forms of malignant lymphomas include, but are not limited to, non-Hodgkin lymphoma and variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGF), Hodgkin's disease and Reed-Sternberg disease.

[4565] As used herein, heart disorders, or “cardiovascular disease” or a “cardiovascular disorder” includes a disease or disorder which affects the cardiovascular system, e.g., the heart, the blood vessels, and/or the blood. A cardiovascular disorder can be caused by an imbalance in arterial pressure, a malfunction of the heart, or an occlusion of a blood vessel, e.g., by a thrombus. A cardiovascular disorder includes, but is not limited to disorders such as arteriosclerosis, atherosclerosis, cardiac hypertrophy, ischemia reperfusion injury, restenosis, arterial inflammation, vascular wall remodeling, ventricular remodeling, rapid ventricular pacing, coronary microembolism, tachycardia, bradycardia, pressure overload, aortic bending, coronary artery ligation, vascular heart disease, valvular disease, including but not limited to, valvular degeneration caused by calcification, rheumatic heart disease, endocarditis, or complications of artificial valves; atrial fibrillation, long-QT syndrome, congestive heart failure, sinus node dysfunction, angina, heart failure, hypertension, atrial flutter, pericardial disease, including but not limited to, pericardial effusion and pericarditis; cardiomyopathies, e.g., dilated cardiomyopathy or idiopathic cardiomyopathy, myocardial infarction, coronary artery disease, coronary artery spasm, ischemic disease, arrhythmia, sudden cardiac death, and cardiovascular developmental disorders (e.g., arteriovenous malformations, arteriovenous fistulae, Raynaud's syndrome, neurogenic thoracic outlet syndrome, causalgia/reflex sympathetic dystrophy, hemangioma, aneurysm, cavernous angioma, aortic valve stenosis, atrial septal defects, atrioventricular canal, coarctation of the aorta, Ebstein's anomaly, hypoplastic left heart syndrome, interruption of the aortic arch, mitral valve prolapse, ductus arteriosus, patent foramen ovale, partial anomalous pulmonary venous return, pulmonary atresia with ventricular septal defect, 
pulmonary atresia without ventricular septal defect, persistence of the fetal circulation, pulmonary valve stenosis, single ventricle, total anomalous pulmonary venous return, transposition of the great vessels, tricuspid atresia, truncus arteriosus, ventricular septal defects). A cardiovascular disease or disorder also can include an endothelial cell disorder.

[4566] As used herein, an “endothelial cell disorder” includes a disorder characterized by aberrant, unregulated, or unwanted endothelial cell activity, e.g., proliferation, migration, angiogenesis, or vascularization; or aberrant expression of cell surface adhesion molecules or genes associated with angiogenesis, e.g., TIE-2, FLT and FLK. Endothelial cell disorders include tumorigenesis, tumor metastasis, psoriasis, diabetic retinopathy, endometriosis, Graves' disease, ischemic disease (e.g., atherosclerosis), and chronic inflammatory diseases (e.g., rheumatoid arthritis).

[4567] Examples of hematopoietic disorders or diseases include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy such as atopic allergy.

[4568] Disorders involving the brain include, but are not limited to, disorders involving neurons, and disorders involving glia, such as astrocytes, oligodendrocytes, ependymal cells, and microglia; cerebral edema, raised intracranial pressure and herniation, and hydrocephalus; malformations and developmental diseases, such as neural tube defects, forebrain anomalies, posterior fossa anomalies, and syringomyelia and hydromyelia; perinatal brain injury; cerebrovascular diseases, such as those related to hypoxia, ischemia, and infarction, including hypotension, hypoperfusion, and low-flow states—global cerebral ischemia and focal cerebral ischemia—infarction from obstruction of local blood supply, intracranial hemorrhage, including intracerebral (intraparenchymal) hemorrhage, subarachnoid hemorrhage and ruptured berry aneurysms, and vascular malformations, hypertensive cerebrovascular disease, including lacunar infarcts, slit hemorrhages, and hypertensive encephalopathy; infections, such as acute meningitis, including acute pyogenic (bacterial) meningitis and acute aseptic (viral) meningitis, acute focal suppurative infections, including brain abscess, subdural empyema, and extradural abscess, chronic bacterial meningoencephalitis, including tuberculosis and mycobacterioses, neurosyphilis, and neuroborreliosis (Lyme disease), viral meningoencephalitis, including arthropod-borne (Arbo) viral encephalitis, Herpes simplex virus Type 1, Herpes simplex virus Type 2, Varicella-zoster virus (Herpes zoster), cytomegalovirus, poliomyelitis, rabies, and human immunodeficiency virus 1, including HIV-1 meningoencephalitis (subacute encephalitis), vacuolar myelopathy, AIDS-associated myopathy, peripheral neuropathy, and AIDS in children, progressive multifocal leukoencephalopathy, subacute sclerosing panencephalitis, fungal meningoencephalitis, other infectious diseases of the nervous system; transmissible spongiform encephalopathies (prion diseases); demyelinating diseases, 
including multiple sclerosis, multiple sclerosis variants, acute disseminated encephalomyelitis and acute necrotizing hemorrhagic encephalomyelitis, and other diseases with demyelination; degenerative diseases, such as degenerative diseases affecting the cerebral cortex, including Alzheimer disease and Pick disease, degenerative diseases of basal ganglia and brain stem, including Parkinsonism, idiopathic Parkinson disease (paralysis agitans), progressive supranuclear palsy, corticobasal degeneration, multiple system atrophy, including striatonigral degeneration, Shy-Drager syndrome, and olivopontocerebellar atrophy, and Huntington disease; spinocerebellar degenerations, including spinocerebellar ataxias, including Friedreich ataxia, and ataxia-telangiectasia, degenerative diseases affecting motor neurons, including amyotrophic lateral sclerosis (motor neuron disease), bulbospinal atrophy (Kennedy syndrome), and spinal muscular atrophy; inborn errors of metabolism, such as leukodystrophies, including Krabbe disease, metachromatic leukodystrophy, adrenoleukodystrophy, Pelizaeus-Merzbacher disease, and Canavan disease, mitochondrial encephalomyopathies, including Leigh disease and other mitochondrial encephalomyopathies; toxic and acquired metabolic diseases, including vitamin deficiencies such as thiamine (vitamin B1) deficiency and vitamin B12 deficiency, neurologic sequelae of metabolic disturbances, including hypoglycemia, hyperglycemia, and hepatic encephalopathy, toxic disorders, including carbon monoxide, methanol, ethanol, and radiation, including combined methotrexate and radiation-induced injury; tumors, such as gliomas, including astrocytoma, including fibrillary (diffuse) astrocytoma and glioblastoma multiforme, pilocytic astrocytoma, pleomorphic xanthoastrocytoma, and brain stem glioma, oligodendroglioma, and ependymoma and related paraventricular mass lesions, neuronal tumors, poorly differentiated neoplasms, including medulloblastoma, other parenchymal 
tumors, including primary brain lymphoma, germ cell tumors, and pineal parenchymal tumors, meningiomas, metastatic tumors, paraneoplastic syndromes, peripheral nerve sheath tumors, including schwannoma, neurofibroma, and malignant peripheral nerve sheath tumor (malignant schwannoma), and neurocutaneous syndromes (phakomatoses), including neurofibromatosis, including Type 1 neurofibromatosis (NF1) and Type 2 neurofibromatosis (NF2), tuberous sclerosis, and Von Hippel-Lindau disease.

[4569] Disorders which may be treated or diagnosed by methods described herein include, but are not limited to, disorders associated with an accumulation in the liver of fibrous tissue, such as that resulting from an imbalance between production and degradation of the extracellular matrix accompanied by the collapse and condensation of preexisting fibers. The methods described herein can be used to diagnose or treat hepatocellular necrosis or injury induced by a wide variety of agents including processes which disturb homeostasis, such as an inflammatory process, tissue damage resulting from toxic injury or altered hepatic blood flow, and infections (e.g., bacterial, viral and parasitic). For example, the methods can be used for the early detection of hepatic injury, such as portal hypertension or hepatic fibrosis. In addition, the methods can be employed to detect liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, A1-antitrypsin deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease), disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome). 
Additionally, the methods described herein may be useful for the early detection and treatment of liver injury associated with the administration of various chemicals or drugs, such as, for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol, or which represents a hepatic manifestation of a vascular disorder such as obstruction of either the intrahepatic or extrahepatic bile flow or an alteration in hepatic circulation resulting, for example, from chronic heart failure, veno-occlusive disease, portal vein thrombosis or Budd-Chiari syndrome.

[4570] The 23479, 48120, or 46689 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO:75, SEQ ID NO:78, or SEQ ID NO:81 thereof are collectively referred to as “polypeptides or proteins of the invention” or “23479, 48120, or 46689 polypeptides or proteins”. Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “23479, 48120, or 46689 nucleic acids.” 23479, 48120, or 46689 molecules refer to 23479, 48120, or 46689 nucleic acids, polypeptides, and antibodies.

[4571] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA), RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA. A DNA or RNA analog can be synthesized from nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[4572] The term “isolated nucleic acid molecule” or “purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules present in the natural source of the nucleic acid. For example, with regard to genomic DNA, the term “isolated” includes nucleic acid molecules that are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[4573] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and nonaqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified.
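
The four conditions listed above can be summarized as a small lookup table. The following Python sketch is illustrative only; the names `StringencyCondition`, `CONDITIONS`, and `select_condition` are hypothetical and do not appear in the cited reference. It assumes that "unless otherwise specified" maps to returning the very high stringency entry when no name is given.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StringencyCondition:
    """One hybridization/wash condition as listed in the text (hypothetical container)."""
    hybridization: str
    wash: str
    wash_temp_c: int                       # wash temperature, degrees C
    max_wash_temp_c: Optional[int] = None  # optional upper wash temperature


# Values transcribed from conditions 1) to 4) of the paragraph above.
CONDITIONS = {
    "low": StringencyCondition("6x SSC at about 45 C", "0.2x SSC, 0.1% SDS", 50, 55),
    "medium": StringencyCondition("6x SSC at about 45 C", "0.2x SSC, 0.1% SDS", 60),
    "high": StringencyCondition("6x SSC at about 45 C", "0.2x SSC, 0.1% SDS", 65),
    "very_high": StringencyCondition(
        "0.5 M sodium phosphate, 7% SDS at 65 C", "0.2x SSC, 1% SDS", 65
    ),
}


def select_condition(name: Optional[str] = None) -> StringencyCondition:
    """Return the named condition; very high stringency when none is specified."""
    return CONDITIONS[name or "very_high"]
```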

[4574] Preferably, an isolated nucleic acid molecule of the invention that hybridizes under a stringency condition described herein to the sequence of SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, or SEQ ID NO:82 corresponds to a naturally-occurring nucleic acid molecule.

[4575] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature. For example, a naturally occurring nucleic acid molecule can encode a natural protein.

[4576] As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules that include at least an open reading frame encoding a 23479, 48120, or 46689 protein. The gene can optionally further include non-coding sequences, e.g., regulatory sequences and introns. Preferably, a gene encodes a mammalian 23479, 48120, or 46689 protein or derivative thereof.

[4577] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. “Substantially free” means that a preparation of 23479, 48120, or 46689 protein is at least 10% pure. In a preferred embodiment, the preparation of 23479, 48120, or 46689 protein has less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-23479, 48120, or 46689 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-23479, 48120, or 46689 chemicals. When the 23479, 48120, or 46689 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[4578] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 23479, 48120, or 46689 without abolishing or substantially altering a 23479, 48120, or 46689 activity. Preferably the alteration does not substantially alter the 23479, 48120, or 46689 activity, e.g., the activity is at least 20%, 40%, 60%, 70% or 80% of wild-type. An “essential” amino acid residue is a residue that, when altered from the wild-type sequence of 23479, 48120, or 46689, results in abolishing a 23479, 48120, or 46689 activity such that less than 20% of the wild-type activity is present. For example, conserved amino acid residues in 23479, 48120, or 46689 are predicted to be particularly unamenable to alteration.

[4579] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a 23479, 48120, or 46689 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a 23479, 48120, or 46689 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 23479, 48120, or 46689 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, or SEQ ID NO:82, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
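
The side chain families listed above can be expressed as sets of one-letter amino acid codes. The following Python sketch is illustrative only; the names `SIDE_CHAIN_FAMILIES`, `shared_families`, and `is_conservative` are hypothetical. It assumes that a substitution counts as conservative when the original and replacement residues share at least one of the listed families, and that replacing a residue with itself is not a substitution.

```python
# Families transcribed from the paragraph above, using one-letter codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),                # lysine, arginine, histidine
    "acidic": set("DE"),                # aspartic acid, glutamic acid
    "uncharged polar": set("GNQSTYC"),  # glycine ... cysteine
    "nonpolar": set("AVLIPFMW"),        # alanine ... tryptophan
    "beta-branched": set("TVI"),        # threonine, valine, isoleucine
    "aromatic": set("YFWH"),            # tyrosine, phenylalanine, tryptophan, histidine
}


def shared_families(a: str, b: str) -> list:
    """Return the sorted names of the families that contain both residues."""
    a, b = a.upper(), b.upper()
    return sorted(
        name for name, members in SIDE_CHAIN_FAMILIES.items()
        if a in members and b in members
    )


def is_conservative(original: str, replacement: str) -> bool:
    """True if the replacement differs from the original and shares a family with it."""
    if original.upper() == replacement.upper():
        return False  # no substitution took place
    return bool(shared_families(original, replacement))
```

Because the families overlap (for example, threonine is both uncharged polar and beta-branched), a pair such as threonine and valine is treated as conservative through the beta-branched family even though the two residues fall in different polarity families.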

[4580] As used herein, a “biologically active portion” of a 23479, 48120, or 46689 protein includes a fragment of a 23479, 48120, or 46689 protein which participates in an interaction, e.g., an intramolecular or an inter-molecular interaction. An inter-molecular interaction can be a specific binding interaction or an enzymatic interaction (e.g., the interaction can be transient and a covalent bond is formed or broken). An inter-molecular interaction can be between a 23479, 48120, or 46689 molecule and a non-23479, 48120, or 46689 molecule or between a first 23479, 48120, or 46689 molecule and a second 23479, 48120, or 46689 molecule (e.g., a dimerization interaction). Biologically active portions of a 23479, 48120, or 46689 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 23479, 48120, or 46689 protein, e.g., the amino acid sequence shown in SEQ ID NO:75, SEQ ID NO:78, or SEQ ID NO:81, which include fewer amino acids than the full length 23479, 48120, or 46689 proteins, and exhibit at least one activity of a 23479, 48120, or 46689 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 23479, 48120, or 46689 protein, e.g., hydrolysis of a substrate molecule, e.g., a protein (e.g., a ubiquitinated protein or poly-ubiquitin), lipid, or small molecule (e.g., metabolite, signaling molecule, toxin, or carcinogen) substrate. A biologically active portion of a 23479, 48120, or 46689 protein can be a polypeptide which is, for example, 10, 25, 50, 100, 200 or more amino acids in length. 
Biologically active portions of a 23479, 48120, or 46689 protein can be used as targets for developing agents which modulate a 23479, 48120, or 46689 mediated activity, e.g., hydrolysis of a substrate molecule, e.g., a protein (e.g., a ubiquitinated protein or poly-ubiquitin), lipid, or small molecule (e.g., metabolite, signaling molecule, toxin, or carcinogen) substrate.

[4581] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[4582] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, 60%, and even more preferably at least 70%, 80%, 90%, 100% of the length of the reference sequence. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”).

[4583] The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
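
The calculation described above can be sketched in Python for two sequences that have already been aligned. The function name `percent_identity` and the use of `-` as the gap character are assumptions made for illustration. The sketch also assumes one particular convention, namely that every alignment column, including gapped columns, counts toward the denominator; other conventions (for example, dividing by the length of the shorter sequence) give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Gapped columns count as non-identical positions (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences have a gap; they carry no information.
    columns = [
        (x, y) for x, y in zip(aligned_a, aligned_b)
        if not (x == gap and y == gap)
    ]
    if not columns:
        raise ValueError("alignment has no comparable positions")
    identical = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * identical / len(columns)
```

For the toy alignment `ACGT-A` against `ACCTTA`, four of the six columns are identical, so the function returns roughly 66.7.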

[4584] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch ((1970) J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used unless otherwise specified) is a BLOSUM 62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
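
The global alignment approach of Needleman and Wunsch can be illustrated with a short dynamic programming sketch in Python. This is not the GAP program and does not reproduce its parameters: for brevity it assumes a simple match/mismatch score and a single linear gap penalty in place of a substitution matrix with separate gap and length weights. The function name `needleman_wunsch` and its default scores are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Global alignment with linear gap penalty; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    def sub(i: int, j: int) -> int:
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

The aligned strings returned by such a routine could then be passed to a percent identity calculation. When several alignments share the optimal score, the traceback order above returns only one of them.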

[4585] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Meyers and W. Miller ((1989) CABIOS, 4:11-17) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[4586] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 23479, 48120, or 46689 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 23479, 48120, or 46689 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25:3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[4587] Particularly preferred 23479, 48120, or 46689 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO:75, SEQ ID NO:78, or SEQ ID NO:81. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:75, SEQ ID NO:78, or SEQ ID NO:81 are termed substantially identical.

[4588] In the context of a nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, or SEQ ID NO:82 are termed substantially identical.

[4589] “Misexpression or aberrant expression”, as used herein, refers to a non-wildtype pattern of gene expression at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over- or under-expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of altered, e.g., increased or decreased, expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, translated amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[4590] “Subject,” as used herein, refers to human and non-human animals. The term “non-human animals” of the invention includes all vertebrates, e.g., mammals, such as non-human primates (particularly higher primates), sheep, dog, rodent (e.g., mouse or rat), guinea pig, goat, pig, cat, rabbits, cow, and non-mammals, such as chickens, amphibians, reptiles, etc. In a preferred embodiment, the subject is a human. In another embodiment, the subject is an experimental animal or animal suitable as a disease model.

[4591] A “purified preparation of cells”, as used herein, refers to an in vitro preparation of cells. In the case of cells from multicellular organisms (e.g., plants and animals), a purified preparation of cells is a subset of cells obtained from the organism, not the entire intact organism. In the case of unicellular microorganisms (e.g., cultured cells and microbial cells), it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[4592] Various aspects of the invention are described in further detail below.

[4593] Isolated Nucleic Acid Molecules of 23479, 48120, and 46689

[4594] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes a 23479, 48120, or 46689 polypeptide described herein, e.g., a full-length 23479, 48120, or 46689 protein or a fragment thereof, e.g., a biologically active portion of 23479, 48120, or 46689 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 23479, 48120, or 46689 mRNA, and fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[4595] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO:74, SEQ ID NO:77, or SEQ ID NO:80, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 23479, 48120, or 46689 protein (i.e., “the coding region” of SEQ ID NO:74, SEQ ID NO:77, or SEQ ID NO:80, as shown in SEQ ID NO:76, SEQ ID NO:79, or SEQ ID NO:82, respectively), as well as 5′ untranslated sequences. Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:74, SEQ ID NO:77, or SEQ ID NO:80 (e.g., SEQ ID NO:76, SEQ ID NO:79, or SEQ ID NO:82, respectively) and, e.g., no flanking sequences which normally accompany the subject sequence. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to a fragment of a 46689 protein from about amino acid 27 to 468 of SEQ ID NO:75.

[4596] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule which is a complement of the nucleotide sequence shown in SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, or SEQ ID NO:82, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, or SEQ ID NO:82, such that it can hybridize (e.g., under a stringency condition described herein) to the nucleotide sequence shown in SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, or SEQ ID NO:82, thereby forming a stable duplex.

[4597] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence which is at least about: 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, or SEQ ID NO:82, or a portion, preferably of the same length, of any of these nucleotide sequences.

[4598] 23479 and 48120 Nucleic Acid Fragments

[4599] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, or SEQ ID NO:79. For example, such a nucleic acid molecule can include a fragment that can be used as a probe or primer or a fragment encoding a portion of a 23479 or 48120 protein, e.g., an immunogenic or biologically active portion of a 23479 or 48120 protein. A fragment can comprise those nucleotides of SEQ ID NO:74 or SEQ ID NO:77 which encode a UCH-1, UCH-2, UBA, or UIM domain of human 23479 or 48120. The nucleotide sequence determined from the cloning of the 23479 or 48120 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 23479 or 48120 family members, or fragments thereof, as well as 23479 or 48120 homologues, or fragments thereof, from other species.

[4600] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment that includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 20 amino acids in length. Preferably, fragments are at least 50, 75, 100, 200, 300, 400, 500, 600, 700, 800, or 900 amino acids in length. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention.

[4601] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domains, regions, or functional sites described herein. Thus, for example, a 23479 or 48120 nucleic acid fragment can include a sequence corresponding to a UCH-1 domain and a UCH-2 domain. A 48120 nucleic acid fragment can further include a sequence corresponding to a UBA domain and a UIM domain.

[4602] 23479 or 48120 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under a stringency condition described herein to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, or SEQ ID NO:79, or of a naturally occurring allelic variant or mutant of SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, or SEQ ID NO:79. Preferably, an oligonucleotide is less than about 200, 150, 120, or 100 nucleotides in length.
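
As an illustration of "consecutive nucleotides of a sense or antisense sequence", the following Python sketch enumerates every window of a chosen length on both strands of a sequence. The function names `reverse_complement` and `candidate_probes` are hypothetical, and the short sequences used in any example call are placeholders rather than portions of the sequence listing. The sketch only enumerates candidate target windows; it does not assess hybridization behavior.

```python
from typing import Iterator, Tuple

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the antisense strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def candidate_probes(seq: str, k: int) -> Iterator[Tuple[str, int, str]]:
    """Yield (strand, 1-based start on that strand, window of k consecutive nucleotides)."""
    if k <= 0:
        raise ValueError("k must be positive")
    for strand, s in (("sense", seq), ("antisense", reverse_complement(seq))):
        for i in range(len(s) - k + 1):
            yield strand, i + 1, s[i:i + k]
```

A call such as `list(candidate_probes(sequence, 20))` would list all 20-nucleotide windows; filtering by length limits such as "less than about 200, 150, 120, or 100 nucleotides" is left to the caller.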

[4603] In one embodiment, the probe or primer is attached to a solid support, e.g., a solid support described herein.

[4604] One exemplary kit of primers includes a forward primer that anneals to the coding strand and a reverse primer that anneals to the non-coding strand. The forward primer can anneal to the start codon, e.g., the nucleic acid sequence encoding amino acid residue 1 of SEQ ID NO:75 or SEQ ID NO:78. The reverse primer can anneal to the ultimate codon, e.g., the codon immediately before the stop codon, e.g., the codon encoding amino acid residue 934 of SEQ ID NO:75 or amino acid residue 1139 of SEQ ID NO:78. In a preferred embodiment, the annealing temperatures of the forward and reverse primers differ by no more than 5, 4, 3, or 2° C.
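
Whether a forward and reverse primer satisfy the temperature criterion above can be checked with a rough estimate. The following Python sketch uses the Wallace rule (2 °C per A or T, 4 °C per G or C), a common approximation for short oligonucleotides, as a stand-in for annealing temperature; this substitution is an assumption, and more accurate nearest-neighbor methods exist. The function names `tm_wallace` and `primers_compatible` are hypothetical, and any sequences passed to them in examples are placeholders rather than primers derived from the sequence listing.

```python
def tm_wallace(primer: str) -> int:
    """Estimate melting temperature (degrees C) as 2*(A+T) + 4*(G+C)."""
    p = primer.upper()
    if not p or set(p) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def primers_compatible(forward: str, reverse: str, max_diff_c: int = 5) -> bool:
    """True if the two estimated temperatures differ by no more than max_diff_c."""
    return abs(tm_wallace(forward) - tm_wallace(reverse)) <= max_diff_c
```

The `max_diff_c` argument corresponds to the thresholds of 5, 4, 3, or 2 °C recited above.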

[4605] In a preferred embodiment the nucleic acid is a probe which is at least 10, 12, 15, 18, 20 and less than 200, more preferably less than 100, or less than 50, nucleotides in length. It should be identical, or differ by 1, or 2, or less than 5 or 10 nucleotides, from a sequence disclosed herein. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[4606] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid which encodes: a UCH-1 domain from about amino acid 296-327 of SEQ ID NO:75 or amino acid 162-193 of SEQ ID NO:78; a UCH-2 domain from about amino acid 546-640 of SEQ ID NO:75 or amino acid 580-649 of SEQ ID NO:78; a UBA domain from about amino acid 20-61 of SEQ ID NO:78; or a UIM domain from about amino acid 96-113 of SEQ ID NO:78.

[4607] In another embodiment a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 23479 or 48120 sequence, e.g., a domain, region, site or other sequence described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical to, or differ by one base from, a sequence disclosed herein or a naturally occurring variant. For example, primers suitable for amplifying all or a portion of any of the following regions are provided: a UCH-1 domain from about amino acid 296-327 of SEQ ID NO:75 or amino acid 162-193 of SEQ ID NO:78; a UCH-2 domain from about amino acid 546-640 of SEQ ID NO:75 or amino acid 580-649 of SEQ ID NO:78; a UBA domain from about amino acid 20-61 of SEQ ID NO:78; and a UIM domain from about amino acid 96-113 of SEQ ID NO:78.
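
The amino acid ranges recited above can be converted to nucleotide positions within a coding sequence by codon arithmetic. The following Python sketch is illustrative only; the names `DOMAINS` and `aa_range_to_cds` are hypothetical. It assumes 1-based, inclusive coordinates counted from the first nucleotide of the start codon of the coding region, so the results are positions within the coding region and not within the longer cDNA sequences, whose 5′ untranslated lengths are not restated here.

```python
from typing import Tuple

# Domain boundaries (amino acid positions) transcribed from the paragraph above.
DOMAINS = {
    ("SEQ ID NO:75", "UCH-1"): (296, 327),
    ("SEQ ID NO:75", "UCH-2"): (546, 640),
    ("SEQ ID NO:78", "UCH-1"): (162, 193),
    ("SEQ ID NO:78", "UCH-2"): (580, 649),
    ("SEQ ID NO:78", "UBA"): (20, 61),
    ("SEQ ID NO:78", "UIM"): (96, 113),
}


def aa_range_to_cds(first_aa: int, last_aa: int) -> Tuple[int, int]:
    """Map a 1-based amino acid range to 1-based nucleotide positions in the coding region."""
    if first_aa < 1 or last_aa < first_aa:
        raise ValueError("expected 1 <= first_aa <= last_aa")
    # Codon n occupies nucleotides 3*(n-1)+1 through 3*n.
    return 3 * (first_aa - 1) + 1, 3 * last_aa


# Example: coding-region coordinates for each listed domain.
CDS_COORDINATES = {key: aa_range_to_cds(*aa) for key, aa in DOMAINS.items()}
```

Under these assumptions the UCH-1 domain at amino acids 296-327 corresponds to nucleotides 886-981 of the coding region, a span of 96 nucleotides (32 codons).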

[4608] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[4609] A nucleic acid fragment encoding a “biologically active portion of a 23479 or 48120 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, or SEQ ID NO:79, which encodes a polypeptide having a 23479 or 48120 biological activity (e.g., the biological activities of the 23479 or 48120 proteins are described herein), expressing the encoded portion of the 23479 or 48120 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 23479 or 48120 protein. For example, a nucleic acid fragment encoding a biologically active portion of 23479 or 48120 includes a UCH-1 domain from about amino acid 296-327 of SEQ ID NO:75 or amino acid 162-193 of SEQ ID NO:78, a UCH-2 domain from about amino acid 546-640 of SEQ ID NO:75 or amino acid 580-649 of SEQ ID NO:78, a UBA domain from about amino acid 20-61 of SEQ ID NO:78, or a UIM domain from about amino acid 96-113 of SEQ ID NO:78. A nucleic acid fragment encoding a biologically active portion of a 23479 or 48120 polypeptide may comprise a nucleotide sequence which is 300 or more nucleotides in length.

[4610] In preferred embodiments, a nucleic acid includes a nucleotide sequence which is about 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700, 2800, 2900, 3000, 3100, 3200, 3300, 3400, 3500, 3600, 3700, 3800, 3900, 4000, 4100, 4200, 4300, 4400, 4500, 4600, 4700, 4800, or more nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, or SEQ ID NO:79.

[4611] In a preferred embodiment, a nucleic acid fragment differs by at least 1, 2, 3, 10, 20, or more nucleotides from a sequence described in WO 01/55301, WO 01/57058, or WO 01/38543, or Genbank™ accession numbers AB018272 or AK001193. Differences can include differing in length or sequence identity. For example, a nucleic acid fragment can: include one or more nucleotides from SEQ ID NO:74 or SEQ ID NO:76 located outside the region of nucleotides 2314-3363, 1788-2315, 1865-2831, 2660-3206, 1723-2649, 441-1046, 3012-3206, 2493-2726, or 1723-2372 of SEQ ID NO:74; include one or more nucleotides from SEQ ID NO:77 or SEQ ID NO:79 located outside the region of nucleotides 2722-4329, 2818-3713, 1366-2233, 2923-3535, 1366-1829, 2766-3169, 2766-3962, or 3958-4653 of SEQ ID NO:77; not include all of the nucleotides of a sequence of WO 01/55301 or WO 01/57058, or Genbank™ accession numbers AB018272 or AK001193, e.g., can be one or more nucleotides shorter (at one or both ends) than a sequence of WO 01/55301 or WO 01/57058, or Genbank™ accession numbers AB018272 or AK001193; or can differ by one or more nucleotides in the region of overlap.
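
The nucleotide regions recited above for SEQ ID NO:74 overlap one another, so it can help to merge them before testing whether a position lies outside them. The following Python sketch is illustrative only; the names `REGIONS_SEQ74`, `merge_regions`, and `is_outside` are hypothetical. It assumes 1-based, inclusive coordinates and adopts the stricter reading in which a nucleotide is "outside" only if it falls in none of the listed regions; the text could also be read region by region.

```python
from typing import List, Tuple

# Regions of SEQ ID NO:74 transcribed from the paragraph above.
REGIONS_SEQ74 = [
    (2314, 3363), (1788, 2315), (1865, 2831), (2660, 3206), (1723, 2649),
    (441, 1046), (3012, 3206), (2493, 2726), (1723, 2372),
]


def merge_regions(regions: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Merge overlapping inclusive intervals into a sorted, non-overlapping list."""
    merged: List[Tuple[int, int]] = []
    for start, end in sorted(regions):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it if needed.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def is_outside(position: int, regions: List[Tuple[int, int]]) -> bool:
    """True if the position falls in none of the given inclusive regions."""
    return all(not (start <= position <= end) for start, end in regions)
```

With the regions as listed, merging yields two spans, nucleotides 441-1046 and 1723-3363, so a position such as 1200 is outside every listed region while a position such as 2000 is not.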

[4612] 46689 Nucleic Acid Fragments

[4613] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO:80 or 82. For example, such a nucleic acid molecule can include a fragment that can be used as a probe or primer or a fragment encoding a portion of a 46689 protein, e.g., an immunogenic or biologically active portion of a 46689 protein. A fragment can comprise those nucleotides of SEQ ID NO:80 which encode an α/β hydrolase domain of human 46689. The nucleotide sequence determined from the cloning of the 46689 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 46689 family members, or fragments thereof, as well as 46689 homologues, or fragments thereof, from other species.

[4614] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment that includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 50, 100, 124, 136, 150, 185, 190, 200, 238, or more amino acids in length. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention.

[4615] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domains, regions, or functional sites described herein. Thus, for example, a 46689 nucleic acid fragment can include a sequence corresponding to an α/β hydrolase domain, e.g., about nucleotides 670 to 1371 of SEQ ID NO:80, a region that includes a transmembrane domain, e.g., about nucleotides 115 to 669 of SEQ ID NO:80, or a region that includes both an α/β hydrolase domain and a transmembrane domain, e.g., about nucleotides 562 to 1371 of SEQ ID NO:80.

[4616] 46689 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under a stringency condition described herein to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO:80 or SEQ ID NO:82, or of a naturally occurring allelic variant or mutant of SEQ ID NO:80 or SEQ ID NO:82. Preferably, an oligonucleotide is less than about 200, 150, 120, or 100 nucleotides in length.

[4617] In one embodiment, the probe or primer is attached to a solid support, e.g., a solid support described herein.

[4618] One exemplary kit of primers includes a forward primer that anneals to the coding strand and a reverse primer that anneals to the non-coding strand. The forward primer can anneal to the start codon, e.g., the nucleic acid sequence encoding amino acid residue 1 of SEQ ID NO:81. The reverse primer can anneal to the ultimate codon, e.g., the codon immediately before the stop codon, e.g., the codon encoding amino acid residue 468 of SEQ ID NO:81. In a preferred embodiment, the annealing temperatures of the forward and reverse primers differ by no more than 5, 4, 3, or 2° C.
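As a rough illustration of the annealing-temperature criterion above, the following sketch estimates primer melting temperatures with the Wallace rule (2° C. per A or T, 4° C. per G or C), a common approximation for short oligonucleotides, and checks whether a primer pair falls within a chosen tolerance. The primer sequences and function names are hypothetical and are not taken from SEQ ID NO:80, and the estimated melting temperature is used here only as a proxy for the annealing temperature.

```python
def wallace_tm(primer: str) -> int:
    """Estimate melting temperature (deg C) with the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def primers_compatible(forward: str, reverse: str, max_delta: int = 5) -> bool:
    """Return True if the two estimated temperatures differ by no more than max_delta."""
    return abs(wallace_tm(forward) - wallace_tm(reverse)) <= max_delta


# Hypothetical 18-mers used only to exercise the functions.
FORWARD = "ATGGCTGACCTGAAGCTG"
REVERSE = "TCAGGTCTTCAGCAGGAT"

if __name__ == "__main__":
    print(wallace_tm(FORWARD), wallace_tm(REVERSE))
    print(primers_compatible(FORWARD, REVERSE, max_delta=2))
```

The tolerance argument corresponds to the 5, 4, 3, or 2° C. limits recited above. A nearest-neighbor thermodynamic model would give better estimates for longer primers.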

[4619] In a preferred embodiment the nucleic acid is a probe which is at least 10, 12, 15, 18, 20 and less than 200, more preferably less than 100, or less than 50, nucleotides in length. It should be identical, or differ by 1, or 2, or less than 5 or 10 nucleotides, from a sequence disclosed herein. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[4620] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid which encodes: an α/β hydrolase domain, e.g., about amino acid residues 186 to 419 of SEQ ID NO:81; a region that includes a transmembrane domain, e.g., about amino acid residues 1 to 185 or 27 to 185 of SEQ ID NO:81; or a region that includes both an α/β hydrolase domain and a transmembrane domain, e.g., about amino acid residues 150 to 419 of SEQ ID NO:81.
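The amino acid residue ranges of SEQ ID NO:81 recited above and the nucleotide ranges recited for SEQ ID NO:80 can be related by simple codon arithmetic. The sketch below assumes that the coding region begins at nucleotide 115 of SEQ ID NO:80; that start position is inferred from the recited correspondence between residues 1 to 185 and nucleotides 115 to 669 and is labeled as an assumption in the code. The function name is hypothetical.

```python
# Assumption: the first codon of the coding region starts at nucleotide 115
# of SEQ ID NO:80 (inferred from "nucleotides 115 to 669" <-> "residues 1 to 185").
CDS_START = 115


def residues_to_nucleotides(first_res: int, last_res: int, cds_start: int = CDS_START):
    """Map a 1-based, inclusive residue range to a 1-based, inclusive nucleotide range."""
    if first_res < 1 or last_res < first_res:
        raise ValueError("residue range must satisfy 1 <= first_res <= last_res")
    start = cds_start + 3 * (first_res - 1)   # first base of the first codon
    end = cds_start + 3 * last_res - 1        # last base of the last codon
    return start, end


if __name__ == "__main__":
    print(residues_to_nucleotides(186, 419))  # alpha/beta hydrolase domain
    print(residues_to_nucleotides(1, 185))    # region including the transmembrane domain
    print(residues_to_nucleotides(150, 419))  # region including both domains
```

Under that assumption the three residue ranges map to nucleotides 670 to 1371, 115 to 669, and 562 to 1371, respectively, which agrees with the nucleotide ranges recited for the corresponding fragments.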

[4621] In another embodiment a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 46689 sequence, e.g., a domain, region, site or other sequence described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical to, or differ by one base from, a sequence disclosed herein or a naturally occurring variant. For example, primers suitable for amplifying all or a portion of any of the following regions are provided: an α/β hydrolase domain, e.g., about amino acid residues 186 to 419 of SEQ ID NO:81; a region that includes a transmembrane domain, e.g., about amino acid residues 1 to 185 or 27 to 185 of SEQ ID NO:81; or a region that includes both an α/β hydrolase domain and a transmembrane domain, e.g., about amino acid residues 150 to 419 of SEQ ID NO:81.

[4622] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[4623] A nucleic acid fragment encoding a “biologically active portion of a 46689 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:80 or 82, which encodes a polypeptide having a 46689 biological activity (e.g., the biological activities of the 46689 proteins are described herein), expressing the encoded portion of the 46689 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 46689 protein. For example, a nucleic acid fragment encoding a biologically active portion of 46689 includes an α/β hydrolase domain, e.g., amino acid residues about 186 to 419 of SEQ ID NO:81. A nucleic acid fragment encoding a biologically active portion of a 46689 polypeptide, may comprise a nucleotide sequence which is greater than 712 or more nucleotides in length.

[4624] In preferred embodiments, a nucleic acid includes a nucleotide sequence that is about 300, 372, 408, 500, 555, 561, 600, 700, 712, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2050, or more nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO:80, or SEQ ID NO:82.

[4625] In a preferred embodiment, a nucleic acid fragment differs by at least 1, 2, 3, 10, 20, or more nucleotides from the sequence of SEQ ID NO: 7258 of WO 01/57188 or SEQ ID NO: 790 of WO 00/52165. Differences can include differing in length or sequence identity. For example, a nucleic acid fragment can: include one or more nucleotides from SEQ ID NO:80 or SEQ ID NO:82 located outside the region of nucleotides 246 to 799, 808 to 1519, 743 to 1113, 1115 to 1521, or 1523 to 2082; not include all of the nucleotides of SEQ ID NO: 7258 of WO 01/57188 or SEQ ID NO: 790 of WO 00/52165, e.g., can be one or more nucleotides shorter (at one or both ends) than the sequence of SEQ ID NO: 7258 of WO 01/57188 or SEQ ID NO: 790 of WO 00/52165; or can differ by one or more nucleotides in the region of overlap.

[4626] 23479, 48120, or 46689 Nucleic Acid Variants

[4627] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, or SEQ ID NO:82. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid which encodes the same 23479, 48120, or 46689 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence which differs by at least 1, but less than 5, 10, 20, 50, or 100 amino acid residues from that shown in SEQ ID NO:75, SEQ ID NO:78, or SEQ ID NO:81. If alignment is needed for this comparison the sequences should be aligned for maximum homology. The encoded protein can differ by no more than 5, 4, 3, 2, or 1 amino acid. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[4628] Nucleic acids of the invention can be chosen for having codons which are preferred, or non-preferred, for a particular expression system. E.g., the nucleic acid can be one in which at least one codon, and preferably at least 10% or 20% of the codons, has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells.
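The proportion of altered codons referred to above can be computed by comparing two coding sequences codon by codon. The following is a minimal sketch under stated assumptions: both sequences are in frame, have the same number of codons, and use the DNA alphabet; the two short sequences are hypothetical and are not taken from any sequence of the invention. The sketch only counts changed codons and does not itself verify that the changes are synonymous.

```python
def codons(cds: str):
    """Split an in-frame coding sequence into a list of codons."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def altered_codon_fraction(original: str, modified: str) -> float:
    """Return the fraction of codon positions at which the two sequences differ."""
    a, b = codons(original.upper()), codons(modified.upper())
    if len(a) != len(b) or not a:
        raise ValueError("sequences must contain the same non-zero number of codons")
    changed = sum(1 for x, y in zip(a, b) if x != y)
    return changed / len(a)


# Hypothetical six-codon sequences; the changed codons are synonymous
# (Ala, Leu, Arg, Pro) under the standard genetic code.
ORIGINAL = "ATGGCTCTAAGGCCCTAA"
MODIFIED = "ATGGCGCTGCGTCCGTAA"

if __name__ == "__main__":
    fraction = altered_codon_fraction(ORIGINAL, MODIFIED)
    print(f"{fraction:.0%} of codons altered")
```

A value of at least 0.10 or 0.20 would correspond to the 10% or 20% thresholds mentioned above.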

[4629] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism), or can be non-naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared in the encoded product).

[4630] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, or SEQ ID NO:82, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 1%, 5%, 10% or 20% of the nucleotides in the subject nucleic acid. The nucleic acid can differ by no more than 5, 4, 3, 2, or 1 nucleotide. If necessary for this analysis the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.
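The counting convention described above, in which mismatches and “looped” out positions from insertions or deletions are both treated as differences, can be sketched as follows. The sketch assumes the two sequences have already been aligned (e.g., for maximum homology) and padded with "-" at gap positions; the alignment step itself is not shown. The example sequences and function names are hypothetical, and taking the ungapped length of the second sequence as the denominator for the percentage is an interpretive assumption.

```python
def count_differences(aligned_ref: str, aligned_subject: str) -> int:
    """Count alignment columns that differ; a gap ('-') against a base counts as a difference."""
    if len(aligned_ref) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    pairs = zip(aligned_ref.upper(), aligned_subject.upper())
    return sum(1 for a, b in pairs if a != b)


def difference_fraction(aligned_ref: str, aligned_subject: str) -> float:
    """Differences divided by the ungapped length of the subject sequence (assumed denominator)."""
    subject_len = len(aligned_subject.replace("-", ""))
    if subject_len == 0:
        raise ValueError("subject sequence is empty")
    return count_differences(aligned_ref, aligned_subject) / subject_len


# Hypothetical pre-aligned sequences: one mismatch and one single-base deletion.
REF     = "ATGGCCTTAGGCTA"
SUBJECT = "ATGGACTT-GGCTA"

if __name__ == "__main__":
    n = count_differences(REF, SUBJECT)
    print(n, 1 <= n < 10)                               # "at least one but less than 10"
    print(round(difference_fraction(REF, SUBJECT), 3))  # compare against 1%, 5%, 10%, 20%
```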

[4631] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the nucleotide sequence shown in SEQ ID NO:75, SEQ ID NO:78, or SEQ ID NO:81 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under a stringency condition described herein, to the nucleotide sequence shown in SEQ ID NO:75, SEQ ID NO:78, SEQ ID NO:81 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 23479, 48120, or 46689 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 23479, 48120, or 46689 gene.

[4632] Preferred variants include those that are correlated with hydrolase activity, e.g., the hydrolysis of a substrate molecule, e.g., a protein (e.g., a ubiquitinated protein or poly-ubiquitin), lipid, or small molecule (e.g., metabolite, signaling molecule, toxin, or carcinogen) substrate.

[4633] Allelic variants of 23479, 48120, or 46689, e.g., human 23479, 48120, or 46689, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 23479, 48120, or 46689 protein within a population that maintain the ability to bind and hydrolyze substrate molecules, e.g., protein (e.g., a ubiquitinated protein or poly-ubiquitin), lipid, or small molecule (e.g., metabolite, signaling molecule, toxin, or carcinogen) substrates. Functional allelic variants will typically contain only conservative substitution of one or more amino acids of SEQ ID NO:75, SEQ ID NO:78, or SEQ ID NO:81, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally-occurring amino acid sequence variants of the 23479, 48120, or 46689, e.g., human 23479, 48120, or 46689, protein within a population that do not have the ability to bind and hydrolyze substrate molecules, e.g., protein (e.g., a ubiquitinated protein or poly-ubiquitin), lipid, or small molecule (e.g., metabolite, signaling molecule, toxin, or carcinogen) substrates. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO:75, SEQ ID NO:78, or SEQ ID NO:81, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[4634] Moreover, nucleic acid molecules encoding other 23479, 48120, or 46689 family members and, thus, which have a nucleotide sequence which differs from the 23479, 48120, or 46689 sequences of SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, or SEQ ID NO:82 are intended to be within the scope of the invention.

[4635] Antisense Nucleic Acid Molecules, Ribozymes and Modified 23479, 48120, or 46689 Nucleic Acid Molecules

[4636] In another aspect, the invention features an isolated nucleic acid molecule which is antisense to 23479, 48120, or 46689. An “antisense” nucleic acid can include a nucleotide sequence that is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 23479, 48120, or 46689 coding strand, or to only a portion thereof (e.g., the coding region of human 23479, 48120, or 46689 corresponding to SEQ ID NO:76, SEQ ID NO:79, or SEQ ID NO:82, respectively). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 23479, 48120, or 46689 (e.g., the 5′ and 3′ untranslated regions).

[4637] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 23479, 48120, or 46689 mRNA, but more preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding region of 23479, 48120, or 46689 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 23479, 48120, or 46689 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence of interest. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
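By way of illustration only, an antisense oligonucleotide spanning the translation start site can be derived as the reverse complement of the targeted stretch of the sense strand. The sketch below assumes the common numbering convention in which the A of the ATG start codon is position +1 and there is no position 0, so that the −10 to +10 region is the 10 nucleotides upstream of the ATG plus the first 10 nucleotides of the coding region. The toy cDNA and the function names are hypothetical and are not taken from any sequence of the invention.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def antisense_around_start(cdna: str, atg_index: int,
                           upstream: int = 10, downstream: int = 10) -> str:
    """Antisense oligo covering positions -upstream..+downstream around the start codon.

    atg_index is the 0-based index of the A of the ATG in the sense-strand cDNA.
    """
    cdna = cdna.upper()
    if cdna[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG codon")
    if atg_index < upstream or atg_index + downstream > len(cdna):
        raise ValueError("requested window extends beyond the sequence")
    target = cdna[atg_index - upstream:atg_index + downstream]
    return reverse_complement(target)


# Hypothetical sense-strand cDNA fragment; the ATG begins at 0-based index 12.
TOY_CDNA = "GCCGCCACCGCCATGGCTGACCTGAAG"

if __name__ == "__main__":
    oligo = antisense_around_start(TOY_CDNA, TOY_CDNA.index("ATG"))
    print(len(oligo), oligo)
```

The `upstream` and `downstream` arguments can be widened to produce the longer oligonucleotides listed above.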

[4638] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[4639] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a 23479, 48120, or 46689 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

[4640] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[4641] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for a 23479, 48120, or 46689-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of a 23479, 48120, or 46689 cDNA disclosed herein (i.e., SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, or SEQ ID NO:82), and a sequence having a known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haseloff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a 23479, 48120, or 46689-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 23479, 48120, or 46689 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science 261:1411-1418.

[4642] 23479, 48120, or 46689 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 23479, 48120, or 46689 gene (e.g., the 23479, 48120, or 46689 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 23479, 48120, or 46689 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6:569-84; Helene, C. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) Bioessays 14:807-15. The potential sequences that can be targeted for triple helix formation can be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.

[4643] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[4644] A 23479, 48120, or 46689 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For non-limiting examples of synthetic oligonucleotides with modifications see Toulmé (2001) Nature Biotech. 19:17 and Faria et al. (2001) Nature Biotech. 19:40-44. Such phosphoramidite oligonucleotides can be effective antisense agents.

[4645] For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4: 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra and Perry-O'Keefe et al. Proc. Natl. Acad. Sci. 93: 14670-675.

[4646] PNAs of 23479, 48120, or 46689 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 23479, 48120, or 46689 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene, (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes, (e.g., S1 nucleases (Hyrup B. et al. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[4647] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) Bio-Techniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[4648] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region which is complementary to a 23479, 48120, or 46689 nucleic acid of the invention, and two complementary regions, one having a fluorophore and one a quencher, such that the molecular beacon is useful for quantitating the presence of the 23479, 48120, or 46689 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.

[4649] Isolated 23479 or 48120 Polypeptides

[4650] In another aspect, the invention features an isolated 23479 or 48120 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-23479 or 48120 antibodies. 23479 or 48120 protein can be isolated from cells or tissue sources using standard protein purification techniques. 23479 or 48120 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[4651] Polypeptides of the invention include those which arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when expressed in a native cell.

[4652] In a preferred embodiment, a 23479 or 48120 polypeptide has one or more of the following characteristics:

[4653] (i) it has the ability to cleave a bond between ubiquitin and a substrate, e.g., a protein targeted for degradation by ubiquitination;

[4654] (ii) it has a molecular weight, e.g., a deduced molecular weight, preferably ignoring any contribution of post translational modifications, amino acid composition or other physical characteristic of a 23479 or 48120 polypeptide, e.g., a polypeptide of SEQ ID NO:75 or SEQ ID NO:78;

[4655] (iii) it has an overall sequence similarity of at least 60%, more preferably at least 70%, 80%, 90%, 95%, 98%, 99%, or more with a polypeptide of SEQ ID NO:75 or SEQ ID NO:78;

[4656] (iv) it has a UCH-1 domain which preferably has an overall sequence similarity of at least about 70%, 80%, 90% or 95% with amino acid residues about 296-327 of SEQ ID NO:75 or amino acid 162-193 of SEQ ID NO:78;

[4657] (v) it has a UCH-2 domain, or region which has an overall sequence similarity of at least about 70%, 80%, 90% or 95% with amino acid residues about 546-640 of SEQ ID NO:75 or amino acid 580-649 of SEQ ID NO:78;

[4658] (vi) it has a UBA domain, or region which has an overall sequence similarity of at least about 70%, 80%, 90% or 95% with amino acid residues about 20-61 of SEQ ID NO:78;

[4659] (vii) it has a UIM domain, or region which has an overall sequence similarity of at least about 70%, 80%, 90% or 95% with amino acid residues about 20-61 of SEQ ID NO:78;

[4660] (viii) it has at least one, two, three, four, five, or six predicted N-glycosylation sites (PS00001);

[4661] (ix) it has at least one predicted cAMP and cGMP-dependent protein kinase phosphorylation site (PS00004);

[4662] (x) it has at least one, two, three, four, five, six, seven, eight, nine, 10, 11, 12, 13, 14, or 15 predicted protein kinase C phosphorylation sites (PS00005);

[4663] (xi) it has at least one, two, three, four, five, six, seven, eight, nine, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or even 33 predicted casein kinase II phosphorylation sites (PS00006);

[4664] (xii) it has at least one, two, or three predicted tyrosine kinase phosphorylation sites (PS00007);

[4665] (xiii) it has at least one, two, three, four, five, six, seven, eight, nine, 10, 11, 12, or even 13 predicted N-myristoylation sites (PS00008);

[4666] (xiv) it has at least one amidation site (PS00009);

[4667] (xv) it has at least one carbamoyl-phosphate synthase subdomain signature 2 (PS00867);

[4668] (xvi) it has at least one ubiquitin carboxyl-terminal hydrolase family 2 signature 2 (PS00973);

[4669] (xvii) it has at least one predicted peroxisomal targeting signal; and

[4670] (xviii) it has at least one coiled coil domain.
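Several of the predicted sites in the list above are defined by short PROSITE consensus patterns, so the number of such sites in a polypeptide can be estimated with a simple pattern scan. The following sketch translates four of those patterns into regular expressions: N-{P}-[ST]-{P} for PS00001, [RK](2)-x-[ST] for PS00004, [ST]-x-[RK] for PS00005, and [ST]-x(2)-[DE] for PS00006. The toy peptide and the function name are hypothetical and are not taken from SEQ ID NO:75 or SEQ ID NO:78. Because these patterns are short, they match frequently by chance, and a match is only a prediction.

```python
import re

# PROSITE accession -> regular expression equivalent of the consensus pattern.
PROSITE_REGEX = {
    "PS00001": r"N[^P][ST][^P]",   # N-glycosylation site
    "PS00004": r"[RK]{2}.[ST]",    # cAMP- and cGMP-dependent protein kinase site
    "PS00005": r"[ST].[RK]",       # protein kinase C phosphorylation site
    "PS00006": r"[ST].{2}[DE]",    # casein kinase II phosphorylation site
}


def find_sites(protein: str, accession: str):
    """Return (1-based start, matched residues) for every match, including overlapping ones."""
    # A lookahead with a capture group lets finditer report overlapping matches.
    pattern = re.compile(f"(?=({PROSITE_REGEX[accession]}))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(protein.upper())]


# Hypothetical 14-residue peptide used only to exercise the scan.
TOY_PEPTIDE = "MKRNSTAKKLSDEE"

if __name__ == "__main__":
    for accession in PROSITE_REGEX:
        sites = find_sites(TOY_PEPTIDE, accession)
        print(accession, len(sites), sites)
```

The length of the list returned for each accession gives the site count that would be compared against the "at least one, two, three ..." thresholds above.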

[4671] In a preferred embodiment the 23479 or 48120 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:75 or SEQ ID NO:78. In one embodiment it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another it differs from the corresponding sequence in SEQ ID NO:75 or SEQ ID NO:78 by at least one residue but less than 20%, 15%, 10% or 5% of the residues in it differ from the corresponding sequence in SEQ ID NO:75 or SEQ ID NO:78. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non essential residue or a conservative substitution. In a preferred embodiment the differences are not in the UCH-1, UCH-2, UBA, or UIM domains. In another preferred embodiment one or more differences are in the UCH-1, UCH-2, UBA, or UIM domains.

[4672] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 23479 or 48120 proteins differ in amino acid sequence from SEQ ID NO:75 or SEQ ID NO:78, yet retain biological activity.

[4673] In one embodiment, the protein includes an amino acid sequence at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more homologous to SEQ ID NO:75 or SEQ ID NO:78.

[4674] A 23479 protein or fragment is provided which varies from the sequence of SEQ ID NO:75 in regions defined by amino acids about 1-295, 328-545, or 641-934 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment but which does not differ from SEQ ID NO:75 in regions defined by amino acids about 296-327 or 546-640. A 48120 protein or fragment is provided which varies from the sequence of SEQ ID NO:78 in regions defined by amino acids about 1-19, 62-95, 114-161, 194-579, or 650-1139 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment but which does not differ from SEQ ID NO:78 in regions defined by amino acids about 20-61, 96-113, 162-193, or 580-649. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.

[4675] In one embodiment, a biologically active portion of a 23479 or 48120 protein includes a UCH-1, UCH-2, UBA, or UIM domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 23479 or 48120 protein.

[4676] In a preferred embodiment, the 23479 or 48120 protein has an amino acid sequence shown in SEQ ID NO:75 or SEQ ID NO:78. In other embodiments, the 23479 or 48120 protein is substantially identical to SEQ ID NO:75 or SEQ ID NO:78. In yet another embodiment, the 23479 or 48120 protein is substantially identical to SEQ ID NO:75 or SEQ ID NO:78 and retains the functional activity of the protein of SEQ ID NO:75 or SEQ ID NO:78, as described in detail in the subsections above.

[4677] In a preferred embodiment, a fragment differs by at least 1, 2, 3, 10, 20, or more amino acid residues from an amino acid sequence encoded by a sequence present in WO01/55301, WO 01/57058, WO 01/57272, or WO 01/38543, or Genbank™ accession number BAA34449. Differences can include differing in length or sequence identity. For example, a fragment can: include one or more amino acid residues from SEQ ID NO:75 or SEQ ID NO:78 outside the region of amino acid residues 488-843, 638-934, 753-934, or 440-510 of SEQ ID NO:75 or 428-716, 880-1139, 912-1139, 428-582, or 1017-1081 of SEQ ID NO:78; not include all of the amino acid residues of a sequence present in WO01/55301, WO 01/57058, WO 01/57272, or WO 01/38543, or Genbank™ accession number BAA34449, e.g., can be one or more amino acid residues shorter (at one or both ends) than a sequence present in WO01/55301, WO 01/57058, WO 01/57272, or WO 01/38543, or Genbank™ accession number BAA34449; or can differ by one or more amino acid residues in the region of overlap.

[4678] Isolated 46689 Polypeptides

[4679] In another aspect, the invention features an isolated 46689 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-46689 antibodies. 46689 protein can be isolated from cells or tissue sources using standard protein purification techniques. 46689 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[4680] Polypeptides of the invention include those that arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when expressed in a native cell.

[4681] In a preferred embodiment, a 46689 polypeptide has one or more of the following characteristics:

[4682] (i) it has the ability to bind to and hydrolyze substrate molecules, e.g., protein, lipid, or small molecule (e.g., metabolite, signaling molecule, toxin, or carcinogen) substrates;

[4683] (ii) it has a molecular weight, e.g., a deduced molecular weight, preferably ignoring any contribution of post-translational modifications, amino acid composition or other physical characteristic of a 46689 polypeptide, e.g., a polypeptide of SEQ ID NO:81;

[4684] (iii) it has an overall sequence similarity of at least 60%, more preferably at least 70%, 80%, 90%, 95%, 98%, 99%, or more with a polypeptide of SEQ ID NO:81;

[4685] (iv) it can be found in tumors;

[4686] (v) it has an α/β hydrolase domain which is preferably about 70%, 80%, 90%, 95%, 98%, 99%, or more identical with amino acid residues about 186 to 419 of SEQ ID NO:81;

[4687] (vi) it has a catalytic acid residue;

[4688] (vii) it has a catalytic histidine residue;

[4689] (viii) it has a transmembrane domain, or a region which is about 70%, 80%, 90%, 95%, 98%, 99%, or more identical with amino acid residues about 150 to 167 of SEQ ID NO:81;

[4690] (ix) it has a signal peptide, or a region which is about 70%, 80%, 90%, 95%, 98%, 99%, or more identical with amino acid residues about 150 to 167 of SEQ ID NO:81;

[4691] (x) it has at least one, two, three, preferably four predicted protein kinase C phosphorylation sites (PS00005);

[4692] (xi) it has at least one, two, three, preferably four predicted casein kinase II phosphorylation sites (PS00006);

[4693] (xii) it has at least one predicted cAMP- and cGMP-dependent protein kinase phosphorylation site (PS00004);

[4694] (xiii) it has at least one, preferably two predicted amidation sites (PS00009); and

[4695] (xiv) it has at least one, two, three, four, preferably five predicted N-myristylation sites (PS00008).

[4696] In a preferred embodiment the 46689 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:81. In one embodiment it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another it differs from the corresponding sequence in SEQ ID NO:81 by at least one residue but less than 20%, 15%, 10% or 5% of the residues in it differ from the corresponding sequence in SEQ ID NO:81. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non-essential residue or a conservative substitution. In a preferred embodiment the differences are not in the α/β hydrolase domain, e.g., about amino acid residues 186 to 419 of SEQ ID NO:81, or the transmembrane domain, e.g., about amino acid residues 150 to 167 of SEQ ID NO:81. In another preferred embodiment one or more differences are in the α/β hydrolase domain, e.g., about amino acid residues 186 to 419 of SEQ ID NO:81, or the transmembrane domain, e.g., about amino acid residues 150 to 167 of SEQ ID NO:81.
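The difference counting described above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length by some external tool, that `-` marks a looped-out (gap) position, and that the percentage is taken over the residues of the variant protein or fragment. The function names and the short example sequences are hypothetical and are not taken from SEQ ID NO:81.

```python
def count_differences(aligned_ref: str, aligned_var: str) -> int:
    """Count positions where two pre-aligned sequences differ.

    A gap opposite a residue (a "looped out" insertion or deletion) and a
    residue mismatch each count as one difference.
    """
    if len(aligned_ref) != len(aligned_var):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for a, b in zip(aligned_ref, aligned_var) if a != b)


def percent_different(aligned_ref: str, aligned_var: str, gap: str = "-") -> float:
    """Differences as a percentage of the residues present in the variant."""
    n_residues = sum(1 for b in aligned_var if b != gap)
    if n_residues == 0:
        raise ValueError("variant contains no residues")
    return 100.0 * count_differences(aligned_ref, aligned_var) / n_residues


if __name__ == "__main__":
    # Hypothetical toy alignment: one mismatch (Y/H) and one deletion (K/-).
    ref = "MKTAYIAKQR"
    var = "MKTAHIA-QR"
    print(count_differences(ref, var))            # number of differing positions
    print(round(percent_different(ref, var), 1))  # percent of variant residues
```

A variant would then fall within, for example, the "less than 15" or "less than 20%" thresholds if the returned count is at least one and below the chosen limit.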

[4697] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 46689 proteins differ in amino acid sequence from SEQ ID NO:81, yet retain biological activity.

[4698] In one embodiment, the protein includes an amino acid sequence at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more homologous to SEQ ID NO:81.

[4699] A 46689 protein or fragment is provided which varies from the sequence of SEQ ID NO:81 in regions defined by amino acid residues about 27 to 149, 168 to 185, 242 to 350, and 400 to 468 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment but which does not differ from SEQ ID NO:81 in regions defined by amino acids about 150 to 167, 186 to 241, and 351 to 399. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.
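As a rough illustration of the region-based criterion above, the sketch below checks whether a variant differs from a reference only inside designated variable regions, by at least one but fewer than a chosen number of residues, while leaving designated conserved regions unchanged. It assumes an ungapped comparison of equal-length sequences, so that string position equals residue number. The helper names, the ten-residue reference, and the region boundaries are hypothetical stand-ins, not the coordinates of SEQ ID NO:81.

```python
from typing import List, Tuple

Region = Tuple[int, int]  # 1-based, inclusive residue range


def differing_positions(ref: str, var: str) -> List[int]:
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(ref) != len(var):
        raise ValueError("sequences must have equal length")
    return [i for i, (a, b) in enumerate(zip(ref, var), start=1) if a != b]


def in_any(pos: int, regions: List[Region]) -> bool:
    """True if pos falls inside at least one of the given regions."""
    return any(start <= pos <= end for start, end in regions)


def satisfies_region_rule(ref: str, var: str, variable: List[Region],
                          conserved: List[Region], max_diffs: int) -> bool:
    """At least one but fewer than max_diffs differences in the variable
    regions, and no difference in any conserved region."""
    diffs = differing_positions(ref, var)
    n_variable = sum(1 for p in diffs if in_any(p, variable))
    touches_conserved = any(in_any(p, conserved) for p in diffs)
    return 1 <= n_variable < max_diffs and not touches_conserved


if __name__ == "__main__":
    ref = "ACDEFGHIKL"            # hypothetical reference
    variable = [(1, 4), (8, 10)]  # hypothetical variable regions
    conserved = [(5, 7)]          # hypothetical conserved region
    print(satisfies_region_rule(ref, "ACDQFGHIKL", variable, conserved, 5))
    print(satisfies_region_rule(ref, "ACDEFGYIKL", variable, conserved, 5))
```

For a real comparison, the region lists would be replaced by the residue ranges recited in the paragraph above, and a gapped alignment would first need to be mapped back to reference numbering.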

[4700] In one embodiment, a biologically active portion of a 46689 protein includes an α/β hydrolase domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 46689 protein.

[4701] In a preferred embodiment, the 46689 protein has an amino acid sequence shown in SEQ ID NO:81. In other embodiments, the 46689 protein is substantially identical to SEQ ID NO:81. In yet another embodiment, the 46689 protein is substantially identical to SEQ ID NO:81 and retains the functional activity of the protein of SEQ ID NO:81, as described in detail in the subsections above.

[4702] In a preferred embodiment, a fragment differs by at least 1, 2, 3, 10, 20, or more amino acid residues encoded by a sequence present in SEQ ID NO: 7258 of WO 01/57188 or SEQ ID NO: 790 of WO 00/52165. Differences can include differing in length or sequence identity. For example, a fragment can: include one or more amino acid residues from SEQ ID NO:81 outside the region encoded by nucleotides 246 to 799, 808 to 1519, 743 to 1113, 1115 to 1521, or 1523 to 2082 of SEQ ID NO:80; not include all of the amino acid residues encoded by a nucleotide sequence in SEQ ID NO: 7258 of WO 01/57188 or SEQ ID NO: 790 of WO 00/52165, e.g., can be one or more amino acid residues shorter (at one or both ends) than a sequence encoded by a nucleotide sequence in SEQ ID NO: 7258 of WO 01/57188 or SEQ ID NO: 790 of WO 00/52165; or can differ by one or more amino acid residues in the region of overlap.

[4703] 23479, 48120, or 46689 Chimeric or Fusion Proteins

[4704] In another aspect, the invention provides 23479, 48120, or 46689 chimeric or fusion proteins. As used herein, a 23479, 48120, or 46689 “chimeric protein” or “fusion protein” includes a 23479, 48120, or 46689 polypeptide linked to a non-23479, 48120, or 46689 polypeptide. A “non-23479, 48120, or 46689 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the 23479, 48120, or 46689 protein, e.g., a protein which is different from the 23479, 48120, or 46689 protein and which is derived from the same or a different organism. The 23479, 48120, or 46689 polypeptide of the fusion protein can correspond to all or a portion, e.g., a fragment described herein, of a 23479, 48120, or 46689 amino acid sequence. In a preferred embodiment, a 23479, 48120, or 46689 fusion protein includes at least one (or two) biologically active portions of a 23479, 48120, or 46689 protein. The non-23479, 48120, or 46689 polypeptide can be fused to the N-terminus or C-terminus of the 23479, 48120, or 46689 polypeptide.

[4705] The fusion protein can include a moiety which has a high affinity for a ligand. For example, the fusion protein can be a GST-23479, 48120, or 46689 fusion protein in which the 23479, 48120, or 46689 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 23479, 48120, or 46689. Alternatively, the fusion protein can be a 23479, 48120, or 46689 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 23479, 48120, or 46689 can be increased through use of a heterologous signal sequence.

[4706] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[4707] The 23479, 48120, or 46689 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 23479, 48120, or 46689 fusion proteins can be used to affect the bioavailability of a 23479, 48120, or 46689 substrate. 23479, 48120, or 46689 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example, (i) aberrant modification or mutation of a gene encoding a 23479, 48120, or 46689 protein; (ii) mis-regulation of the 23479, 48120, or 46689 gene; and (iii) aberrant post-translational modification of a 23479, 48120, or 46689 protein.

[4708] Moreover, the 23479, 48120, or 46689 fusion proteins of the invention can be used as immunogens to produce anti-23479, 48120, or 46689 antibodies in a subject, to purify 23479, 48120, or 46689 ligands, and in screening assays to identify molecules which inhibit the interaction of 23479, 48120, or 46689 with a 23479, 48120, or 46689 substrate.

[4709] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 23479, 48120, or 46689-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 23479, 48120, or 46689 protein.

[4710] Variants of 23479, 48120, or 46689 Proteins

[4711] In another aspect, the invention also features a variant of a 23479, 48120, or 46689 polypeptide, e.g., which functions as an agonist (mimetics) or as an antagonist. Variants of the 23479, 48120, or 46689 proteins can be generated by mutagenesis, e.g., discrete point mutation, the insertion or deletion of sequences or the truncation of a 23479, 48120, or 46689 protein. An agonist of the 23479, 48120, or 46689 proteins can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of a 23479, 48120, or 46689 protein. An antagonist of a 23479, 48120, or 46689 protein can inhibit one or more of the activities of the naturally occurring form of the 23479, 48120, or 46689 protein by, for example, competitively modulating a 23479, 48120, or 46689-mediated activity of a 23479, 48120, or 46689 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 23479, 48120, or 46689 protein.

[4712] Variants of a 23479, 48120, or 46689 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a 23479, 48120, or 46689 protein for agonist or antagonist activity.

[4713] Libraries of fragments, e.g., N-terminal, C-terminal, or internal fragments, of a 23479, 48120, or 46689 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of a 23479, 48120, or 46689 protein. Variants in which a cysteine residue is added or deleted or in which a residue which is glycosylated is added or deleted are particularly preferred.

[4714] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property are known in the art. Such methods are adaptable for rapid screening of the gene libraries generated by combinatorial mutagenesis of 23479, 48120, or 46689 proteins. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify 23479, 48120, or 46689 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6:327-331).

[4715] Cell-based assays can be exploited to analyze a variegated 23479, 48120, or 46689 library. For example, a library of expression vectors can be transfected into a cell line, e.g., a cell line which ordinarily responds to 23479, 48120, or 46689 in a substrate-dependent manner. The transfected cells are then contacted with 23479, 48120, or 46689 and the effect of the expression of the mutant on signaling by the 23479, 48120, or 46689 substrate can be detected, e.g., by measuring changes in cellular proliferation. Plasmid DNA can then be recovered from the cells which score for inhibition, or alternatively, potentiation of signaling by the 23479, 48120, or 46689 substrate, and the individual clones further characterized.

[4716] In another aspect, the invention features a method of making a 23479, 48120, or 46689 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 23479, 48120, or 46689 polypeptide. The method includes: altering the sequence of a 23479, 48120, or 46689 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[4717] In another aspect, the invention features a method of making a fragment or analog of a 23479, 48120, or 46689 polypeptide having a biological activity of a naturally occurring 23479, 48120, or 46689 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of a 23479, 48120, or 46689 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[4718] Anti-23479, 48120, or 46689 Antibodies

[4719] In another aspect, the invention provides an anti-23479, 48120, or 46689 antibody, or a fragment thereof (e.g., an antigen-binding fragment thereof). The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. As used herein, the term “antibody” refers to a protein comprising at least one, and preferably two, heavy (H) chain variable regions (abbreviated herein as VH), and at least one and preferably two light (L) chain variable regions (abbreviated herein as VL). The VH and VL regions can be further subdivided into regions of hypervariability, termed “complementarity determining regions” (“CDR”), interspersed with regions that are more conserved, termed “framework regions” (FR). The extent of the framework region and CDR's has been precisely defined (see, Kabat, E. A., et al. (1991) Sequences of Proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, and Chothia, C. et al. (1987) J. Mol. Biol. 196:901-917, which are incorporated herein by reference). Each VH and VL is composed of three CDR's and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.

[4720] The anti-23479, 48120, or 46689 antibody can further include a heavy and light chain constant region, to thereby form a heavy and light immunoglobulin chain, respectively. In one embodiment, the antibody is a tetramer of two heavy immunoglobulin chains and two light immunoglobulin chains, wherein the heavy and light immunoglobulin chains are inter-connected by, e.g., disulfide bonds. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. The light chain constant region is comprised of one domain, CL. The variable region of the heavy and light chains contains a binding domain that interacts with an antigen. The constant regions of the antibodies typically mediate the binding of the antibody to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system.

[4721] As used herein, the term “immunoglobulin” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes. The recognized human immunoglobulin genes include the kappa, lambda, alpha (IgA1 and IgA2), gamma (IgG1, IgG2, IgG3, IgG4), delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Full-length immunoglobulin “light chains” (about 25 kDa or 214 amino acids) are encoded by a variable region gene at the NH2-terminus (about 110 amino acids) and a kappa or lambda constant region gene at the COOH-terminus. Full-length immunoglobulin “heavy chains” (about 50 kDa or 446 amino acids) are similarly encoded by a variable region gene (about 116 amino acids) and one of the other aforementioned constant region genes, e.g., gamma (encoding about 330 amino acids).

[4722] The term “antigen-binding fragment” of an antibody (or simply “antibody portion,” or “fragment”), as used herein, refers to one or more fragments of a full-length antibody that retain the ability to specifically bind to the antigen, e.g., 23479, 48120, or 46689 polypeptide or fragment thereof. Examples of antigen-binding fragments of the anti-23479, 48120, or 46689 antibody include, but are not limited to: (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab′)₂ fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 341:544-546), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) Science 242:423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879-5883). Such single chain antibodies are also encompassed within the term “antigen-binding fragment” of an antibody. These antibody fragments are obtained using conventional techniques known to those with skill in the art, and the fragments are screened for utility in the same manner as are intact antibodies.

[4723] The anti-23479, 48120, or 46689 antibody can be a polyclonal or a monoclonal antibody. In other embodiments, the antibody can be recombinantly produced, e.g., produced by phage display or by combinatorial methods.

[4724] Phage display and combinatorial methods for generating anti-23479, 48120, or 46689 antibodies are known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. International Publication No. WO 92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. International Publication No. WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J. 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and Barbas et al. (1991) PNAS 88:7978-7982, the contents of all of which are incorporated by reference herein).

[4725] In one embodiment, the anti-23479, 48120, or 46689 antibody is a fully human antibody (e.g., an antibody made in a mouse which has been genetically engineered to produce an antibody from a human immunoglobulin sequence), or a non-human antibody, e.g., a rodent (mouse or rat), goat, primate (e.g., monkey), or camel antibody. Preferably, the non-human antibody is a rodent (mouse or rat) antibody. Methods of producing rodent antibodies are known in the art.

[4726] Human monoclonal antibodies can be generated using transgenic mice carrying the human immunoglobulin genes rather than the mouse system. Splenocytes from these transgenic mice immunized with the antigen of interest are used to produce hybridomas that secrete human mAbs with specific affinities for epitopes from a human protein (see, e.g., Wood et al. International Application WO 91/00906; Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International Application WO 92/03918; Kay et al. International Application WO 92/03917; Lonberg, N. et al. 1994 Nature 368:856-859; Green, L. L. et al. 1994 Nature Genet. 7:13-21; Morrison, S. L. et al. 1984 Proc. Natl. Acad. Sci. USA 81:6851-6855; Bruggemann et al. 1993 Year Immunol 7:33-40; Tuaillon et al. 1993 PNAS 90:3720-3724; Bruggemann et al. 1991 Eur J Immunol 21:1323-1326).

[4727] An anti-23479, 48120, or 46689 antibody can be one in which the variable region, or a portion thereof, e.g., the CDR's, are generated in a non-human organism, e.g., a rat or mouse. Chimeric, CDR-grafted, and humanized antibodies are within the invention. Antibodies generated in a non-human organism, e.g., a rat or mouse, and then modified, e.g., in the variable framework or constant region, to decrease antigenicity in a human are within the invention.

[4728] Chimeric antibodies can be produced by recombinant DNA techniques known in the art. For example, a gene encoding the Fc constant region of a murine (or other species) monoclonal antibody molecule is digested with restriction enzymes to remove the region encoding the murine Fc, and the equivalent portion of a gene encoding a human Fc constant region is substituted (see Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., International Application WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al., European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al. (1987) PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al. (1985) Nature 314:446-449; and Shaw et al., 1988, J. Natl Cancer Inst. 80:1553-1559).

[4729] A humanized or CDR-grafted antibody will have at least one or two but generally all three recipient CDR's (of heavy and/or light immunoglobulin chains) replaced with a donor CDR. The antibody may be replaced with at least a portion of a non-human CDR or only some of the CDR's may be replaced with non-human CDR's. It is only necessary to replace the number of CDR's required for binding of the humanized antibody to a 23479, 48120, or 46689 or a fragment thereof. Preferably, the donor will be a rodent antibody, e.g., a rat or mouse antibody, and the recipient will be a human framework or a human consensus framework. Typically, the immunoglobulin providing the CDR's is called the “donor” and the immunoglobulin providing the framework is called the “acceptor.” In one embodiment, the donor immunoglobulin is a non-human (e.g., rodent) immunoglobulin. The acceptor framework is a naturally-occurring (e.g., a human) framework or a consensus framework, or a sequence about 85% or higher, preferably 90%, 95%, 99% or higher identical thereto.

[4730] As used herein, the term “consensus sequence” refers to the sequence formed from the most frequently occurring amino acids (or nucleotides) in a family of related sequences (see, e.g., Winnacker, From Genes to Clones (Verlagsgesellschaft, Weinheim, Germany 1987)). In a family of proteins, each position in the consensus sequence is occupied by the amino acid occurring most frequently at that position in the family. If two amino acids occur equally frequently, either can be included in the consensus sequence. A “consensus framework” refers to the framework region in the consensus immunoglobulin sequence.
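The consensus definition above can be sketched in a few lines. The sketch assumes the family members have already been aligned to equal length. The function name and the example sequences are hypothetical. Where two residues are equally frequent at a position, the definition allows either; this sketch simply keeps whichever residue the counter reports first.

```python
from collections import Counter
from typing import List


def consensus_sequence(aligned: List[str]) -> str:
    """Build a consensus from equal-length, pre-aligned sequences by taking
    the most frequent residue at each position."""
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to equal length")
    consensus = []
    for column in zip(*aligned):
        # most_common(1) yields one (residue, count) pair for the column;
        # a tie is resolved arbitrarily, which the definition permits.
        residue, _count = Counter(column).most_common(1)[0]
        consensus.append(residue)
    return "".join(consensus)


if __name__ == "__main__":
    # Hypothetical family of three aligned framework fragments.
    family = ["QVQLVQ", "EVQLVE", "QVKLVQ"]
    print(consensus_sequence(family))
```

A consensus framework would be obtained the same way, restricted to the framework positions of the aligned immunoglobulin sequences.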

[4731] An antibody can be humanized by methods known in the art. Humanized antibodies can be generated by replacing sequences of the Fv variable region that are not directly involved in antigen binding with equivalent sequences from human Fv variable regions. General methods for generating humanized antibodies are provided by Morrison, S. L., 1985, Science 229:1202-1207, by Oi et al., 1986, BioTechniques. 4:214, and by Queen et al. U.S. Pat. No. 5,585,089, U.S. Pat. No. 5,693,761 and U.S. Pat. No. 5,693,762, the contents of all of which are hereby incorporated by reference. Those methods include isolating, manipulating, and expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable regions from at least one of a heavy or light chain. Sources of such nucleic acid are well known to those skilled in the art and, for example, may be obtained from a hybridoma producing an antibody against a 23479, 48120, or 46689 polypeptide or fragment thereof. The recombinant DNA encoding the humanized antibody, or fragment thereof, can then be cloned into an appropriate expression vector.

[4732] Humanized or CDR-grafted antibodies can be produced by CDR-grafting or CDR substitution, wherein one, two, or all CDR's of an immunoglobulin chain can be replaced. See e.g., U.S. Pat. No. 5,225,539; Jones et al. 1986 Nature 321:522-525; Verhoeyen et al. 1988 Science 239:1534; Beidler et al. 1988 J. Immunol. 141:4053-4060, the contents of all of which are hereby expressly incorporated by reference. Winter describes a CDR-grafting method which may be used to prepare the humanized antibodies of the present invention (UK Patent Application GB 2188638A, filed on Mar. 26, 1987; Winter U.S. Pat. No. 5,225,539), the contents of which are expressly incorporated by reference.

[4733] Also within the scope of the invention are humanized antibodies in which specific amino acids have been substituted, deleted or added. Preferred humanized antibodies have amino acid substitutions in the framework region, such as to improve binding to the antigen. For example, a humanized antibody will have framework residues identical to the donor framework residue or to another amino acid other than the recipient framework residue. To generate such antibodies, a selected, small number of acceptor framework residues of the humanized immunoglobulin chain can be replaced by the corresponding donor amino acids. Preferred locations of the substitutions include amino acid residues adjacent to the CDR, or which are capable of interacting with a CDR (see e.g., U.S. Pat. No. 5,585,089). Criteria for selecting amino acids from the donor are described in U.S. Pat. No. 5,585,089, e.g., columns 12-16 of U.S. Pat. No. 5,585,089, the contents of which are hereby incorporated by reference. Other techniques for humanizing antibodies are described in Padlan et al. EP 519596 A1, published on Dec. 23, 1992.

[4734] In preferred embodiments an antibody can be made by immunizing with purified 23479, 48120, or 46689 antigen, or a fragment thereof, e.g., a fragment described herein, membrane associated antigen, tissue, e.g., crude tissue preparations, whole cells, preferably living cells, lysed cells, or cell fractions, e.g., cytosol or membrane fractions.

[4735] A full-length 23479, 48120, or 46689 protein, or an antigenic peptide fragment of 23479, 48120, or 46689, can be used as an immunogen or can be used to identify anti-23479, 48120, or 46689 antibodies made with other immunogens, e.g., cells, membrane preparations, and the like. The antigenic peptide of 23479, 48120, or 46689 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO:75, SEQ ID NO:78, or SEQ ID NO:81 and encompass an epitope of 23479, 48120, or 46689. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.

[4736] Fragments of 23479 or 48120 can be used, e.g., to characterize the specificity of an antibody or to make immunogens. For example, fragments of 23479 or 48120 which include residues about 275 to 290, 530 to 550, or 640 to 650 of SEQ ID NO:75 or residues about 120 to 155, 680 to 700, or 770 to 800 of SEQ ID NO:78 can be used to make antibodies against hydrophilic regions of the 23479 or 48120 protein. Similarly, fragments of 23479 or 48120 which include residues about 100 to 110, 295 to 310, or 920 to 930 of SEQ ID NO:75 or residues about 1040 to 1055 of SEQ ID NO:78 can be used to make an antibody against a hydrophobic region of the 23479 or 48120 protein; a fragment of 23479 or 48120 which includes residues about 296-327 of SEQ ID NO:75 or 162-193 of SEQ ID NO:78 can be used to make an antibody against the UCH-1 region of the 23479 or 48120 protein; a fragment of 23479 or 48120 which includes residues about 546-640 of SEQ ID NO:75 or 580-649 of SEQ ID NO:78 can be used to make an antibody against the UCH-2 region of the 23479 or 48120 protein; a fragment of 48120 which includes residues about 20-61 of SEQ ID NO:78 can be used to make an antibody against the UBA region of the 48120 protein; and a fragment of 48120 which includes residues about 96-113 of SEQ ID NO:78 can be used to make an antibody against the UIM region of the 48120 protein.

[4737] Similarly, fragments of 46689 can be used, e.g., to characterize the specificity of an antibody or to make immunogens. For example, fragments of 46689 which include residues about 34 to 51, about 333 to 347, or about 438 to 449 of SEQ ID NO:81 can be used to make antibodies against hydrophilic regions of the 46689 protein. Similarly, fragments of 46689 which include residues about 1 to 23, about 133 to 145, or about 150 to 168 of SEQ ID NO:81 can be used to make an antibody against a hydrophobic region of the 46689 protein; fragments of 46689 which include residues about 27 to 149 or about 168 to 468 of SEQ ID NO:81 can be used to make an antibody against a non-transmembrane region of the 46689 protein; or fragments of 46689 which include residues about 186 to 241 or about 350 to 400 of SEQ ID NO:81 can be used to make an antibody against the α/β hydrolase domain of the 46689 protein.

[4738] Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[4739] Antibodies that bind only native 23479, 48120, or 46689 protein, only denatured or otherwise non-native 23479, 48120, or 46689 protein, or which bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be identified by identifying antibodies that bind to native, but not denatured, 23479, 48120, or 46689 protein.

[4740] Preferred epitopes encompassed by the antigenic peptide are regions of 23479, 48120, or 46689 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 23479, 48120, or 46689 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 23479, 48120, or 46689 protein and are thus likely to constitute surface residues useful for targeting antibody production.
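The idea of scanning a sequence for hydrophilic, likely surface-exposed stretches can be illustrated with a simple sliding-window calculation. The sketch below does not implement the Emini surface probability analysis named above; it substitutes the Kyte-Doolittle hydropathy scale, on which more negative values indicate more hydrophilic residues, purely as a stand-in. The function names, the window length of 7, and the toy sequence are hypothetical.

```python
from typing import List, Tuple

# Kyte-Doolittle hydropathy values (more negative = more hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(seq: str, window: int = 7) -> List[Tuple[int, float]]:
    """Mean hydropathy of every window, as (1-based start, score) pairs."""
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = []
    for i in range(len(seq) - window + 1):
        mean = sum(KYTE_DOOLITTLE[r] for r in seq[i:i + window]) / window
        scores.append((i + 1, mean))
    return scores


def most_hydrophilic_window(seq: str, window: int = 7) -> Tuple[int, float]:
    """Return the (start, score) of the window with the lowest mean hydropathy."""
    return min(window_hydropathy(seq, window), key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical toy sequence: a charged stretch flanked by leucines.
    toy = "LLLLLLLKDEKRDELLLLLLL"
    start, score = most_hydrophilic_window(toy, window=7)
    print(start, toy[start - 1:start - 1 + 7], round(score, 2))
```

Windows flagged this way are only candidates; whether a region is actually surface-exposed or antigenic would still need to be assessed by the analyses described in the text.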

[4741] In a preferred embodiment the antibody can bind to the extracellular portion of the 46689 protein, e.g., it can bind to a whole cell which expresses the 46689 protein. In another embodiment, the antibody binds an intracellular portion of the 46689 protein.

[4742] In preferred embodiments, antibodies can bind one or more of purified antigen, membrane associated antigen, tissue, e.g., tissue sections, whole cells, preferably living cells, lysed cells, cell fractions, e.g., cytosol or membrane fractions.

[4743] The anti-23479, 48120, or 46689 antibody can be a single chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher, D. et al. (1999) Ann N Y Acad Sci 880:263-80; and Reiter, Y. (1996) Clin Cancer Res 2:245-52). The single chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 23479, 48120, or 46689 protein.

[4744] In a preferred embodiment the antibody has effector function and/or can fix complement. In other embodiments the antibody does not recruit effector cells or fix complement.

[4745] In a preferred embodiment, the antibody has reduced or no ability to bind an Fc receptor. For example, it is an isotype or subtype, fragment or other mutant, which does not support binding to an Fc receptor, e.g., it has a mutagenized or deleted Fc receptor binding region.

[4746] In a preferred embodiment, an anti-23479 or 48120 antibody alters (e.g., increases or decreases) the de-ubiquitination activity of a 23479 or 48120 polypeptide. For example, the antibody can bind at or in proximity to the active site, e.g., to an epitope that includes a residue located from about 550-567 of SEQ ID NO:75 or 584-601 of SEQ ID NO:78.

[4747] In another preferred embodiment, an anti-46689 antibody alters (e.g., increases or decreases) the hydrolase activity of a 46689 polypeptide. For example, the antibody can bind at or in proximity to the active site, e.g., to an epitope that includes a residue located from about 186 to 241 or 350 to 400 of SEQ ID NO:81.

[4748] The antibody can be coupled to a toxin, e.g., a polypeptide toxin, e.g., ricin or diphtheria toxin or an active fragment thereof, or to a radioactive nucleus or an imaging agent, e.g., a radioactive, enzymatic, or other imaging agent, e.g., an NMR contrast agent. Labels which produce detectable radioactive emissions or fluorescence are preferred.

[4749] An anti-23479, 48120, or 46689 antibody (e.g., monoclonal antibody) can be used to isolate 23479, 48120, or 46689 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-23479, 48120, or 46689 antibody can be used to detect 23479, 48120, or 46689 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-23479, 48120, or 46689 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labelling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

[4750] The invention also includes a nucleic acid which encodes an anti-23479, 48120, or 46689 antibody, e.g., an anti-23479, 48120, or 46689 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g. CHO or lymphatic cells.

[4751] The invention also includes cell lines, e.g., hybridomas, which make an anti-23479, 48120, or 46689 antibody, e.g., an antibody described herein, and methods of using said cells to make an anti-23479, 48120, or 46689 antibody.

[4752] 23479, 48120, and 46689 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[4753] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[4754] A vector can include a 23479, 48120, or 46689 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 23479, 48120, or 46689 proteins, mutant forms of 23479, 48120, or 46689 proteins, fusion proteins, and the like).

[4755] The recombinant expression vectors of the invention can be designed for expression of 23479, 48120, or 46689 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[4756] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[4757] Purified fusion proteins can be used in 23479, 48120, or 46689 activity assays, (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 23479, 48120, or 46689 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells that are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six weeks).

[4758] One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
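
The codon-replacement strategy can be sketched as a simple back-translation that assigns one chosen codon to each amino acid. The table below is an illustrative assumption: each codon does encode the listed residue under the standard genetic code, but whether it is the codon preferentially utilized in a given E. coli strain should be checked against a codon usage compilation such as that of Wada et al. The function name and the example peptide are hypothetical.

```python
# Illustrative one-codon-per-residue table (assumed E. coli preferences; verify
# against a codon usage table before relying on any entry).
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TAA",  # stop codon
}


def back_translate(protein):
    """Return a DNA coding sequence that uses the table's codon for every residue."""
    return "".join(PREFERRED_CODON[aa] for aa in protein.upper())


# Hypothetical four-residue example ending in a stop codon.
dna = back_translate("MKL*")
```

A one-codon-per-residue scheme is the simplest form of this strategy; practical designs often also avoid repeats, unwanted restriction sites, and strong mRNA secondary structure, none of which this sketch attempts.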

[4759] The 23479, 48120, or 46689 expression vector can be a yeast expression vector, a vector for expression in insect cells, e.g., a baculovirus expression vector or a vector suitable for expression in mammalian cells.

[4760] When used in mammalian cells, the expression vector's control functions can be provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[4761] In another embodiment, the promoter is an inducible promoter, e.g., a promoter regulated by a steroid hormone, by a polypeptide hormone (e.g., by means of a signal transduction pathway), or by a heterologous polypeptide (e.g., the tetracycline-inducible systems, “Tet-On” and “Tet-Off”; see, e.g., Clontech Inc., CA, Gossen and Bujard (1992) Proc. Natl. Acad. Sci. USA 89:5547, and Paillard (1998) Human Gene Therapy 9:983).

[4762] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3:537-546).

[4763] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus.

[4764] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., a 23479, 48120, or 46689 nucleic acid molecule within a recombinant expression vector or a 23479, 48120, or 46689 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[4765] A host cell can be any prokaryotic or eukaryotic cell. For example, a 23479, 48120, or 46689 protein can be expressed in bacterial cells (such as E. coli), insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells (African green monkey kidney cells CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182)). Other suitable host cells are known to those skilled in the art.

[4766] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[4767] A host cell of the invention can be used to produce (i.e., express) a 23479, 48120, or 46689 protein. Accordingly, the invention further provides methods for producing a 23479, 48120, or 46689 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a 23479, 48120, or 46689 protein has been introduced) in a suitable medium such that a 23479, 48120, or 46689 protein is produced. In another embodiment, the method further includes isolating a 23479, 48120, or 46689 protein from the medium or the host cell.

[4768] In another aspect, the invention features a cell or purified preparation of cells which include a 23479, 48120, or 46689 transgene, or which otherwise misexpress 23479, 48120, or 46689. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include a 23479, 48120, or 46689 transgene, e.g., a heterologous form of a 23479, 48120, or 46689, e.g., a gene derived from humans (in the case of a non-human cell). The 23479, 48120, or 46689 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene that misexpresses an endogenous 23479, 48120, or 46689, e.g., a gene the expression of which is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders that are related to mutated or misexpressed 23479, 48120, or 46689 alleles or for use in drug screening.

[4769] In another aspect, the invention features a human cell, e.g., a hematopoietic or hepatic stem cell, transformed with nucleic acid which encodes a subject 23479, 48120, or 46689 polypeptide.

[4770] Also provided are cells, preferably human cells, e.g., human hematopoietic, hepatic, neural, lung, ovary, breast or fibroblast cells, in which an endogenous 23479, 48120, or 46689 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 23479, 48120, or 46689 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 23479, 48120, or 46689 gene. For example, an endogenous 23479, 48120, or 46689 gene which is “transcriptionally silent,” e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques such as targeted homologous recombination can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published on May 16, 1991.

[4771] In a preferred embodiment, recombinant cells described herein can be used for replacement therapy in a subject. For example, a nucleic acid encoding a 23479, 48120, or 46689 polypeptide operably linked to an inducible promoter (e.g., a steroid hormone receptor-regulated promoter) is introduced into a human or nonhuman, e.g., mammalian, e.g., porcine recombinant cell. The cell is cultivated and encapsulated in a biocompatible material, such as poly-lysine alginate, and subsequently implanted into the subject. See, e.g., Lanza (1996) Nat. Biotechnol. 14:1107; Joki et al. (2001) Nat. Biotechnol. 19:35; and U.S. Pat. No. 5,876,742. Production of 23479, 48120, or 46689 polypeptide can be regulated in the subject by administering an agent (e.g., a steroid hormone) to the subject. In another preferred embodiment, the implanted recombinant cells express and secrete an antibody specific for a 23479, 48120, or 46689 polypeptide. The antibody can be any antibody or any antibody derivative described herein.

[4772] 23479, 48120, and 46689 Transgenic Animals

[4773] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of a 23479, 48120, or 46689 protein and for identifying and/or evaluating modulators of 23479, 48120, or 46689 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 23479, 48120, or 46689 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[4774] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of a 23479, 48120, or 46689 protein to particular cells. A transgenic founder animal can be identified based upon the presence of a 23479, 48120, or 46689 transgene in its genome and/or expression of 23479, 48120, or 46689 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a 23479, 48120, or 46689 protein can further be bred to other transgenic animals carrying other transgenes.

[4775] 23479, 48120, or 46689 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments the nucleic acid is placed under the control of a tissue specific promoter, e.g., a milk or egg specific promoter, and recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[4776] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[4777] Uses of 23479, 48120, and 46689

[4778] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: a) screening assays; b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); and c) methods of treatment (e.g., therapeutic and prophylactic).

[4779] The isolated nucleic acid molecules of the invention can be used, for example, to express a 23479, 48120, or 46689 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect a 23479, 48120, or 46689 mRNA (e.g., in a biological sample) or a genetic alteration in a 23479, 48120, or 46689 gene, and to modulate 23479, 48120, or 46689 activity, as described further below. The 23479, 48120, or 46689 proteins can be used to treat disorders characterized by insufficient or excessive production of a 23479, 48120, or 46689 substrate or production of 23479, 48120, or 46689 inhibitors. In addition, the 23479, 48120, or 46689 proteins can be used to screen for naturally occurring 23479, 48120, or 46689 substrates, to screen for drugs or compounds which modulate 23479, 48120, or 46689 activity, as well as to treat disorders characterized by insufficient or excessive production of 23479, 48120, or 46689 protein or production of 23479, 48120, or 46689 protein forms which have decreased, aberrant or unwanted activity compared to 23479, 48120, or 46689 wild type protein (e.g., a cellular proliferative or differentiative disorder, e.g., in the lung, brain, ovary, or breast; a neural disorder; a hematopoietic disorder; a cardiovascular disorder; or a liver disorder). Moreover, the anti-23479, 48120, or 46689 antibodies of the invention can be used to detect and isolate 23479, 48120, or 46689 proteins, regulate the bioavailability of 23479, 48120, or 46689 proteins, and modulate 23479, 48120, or 46689 activity.

[4780] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 23479, 48120, or 46689 polypeptide is provided. The method includes: contacting the compound with the subject 23479, 48120, or 46689 polypeptide; and evaluating ability of the compound to interact with, e.g., to bind or form a complex with the subject 23479, 48120, or 46689 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with subject 23479, 48120, or 46689 polypeptide. It can also be used to find natural or synthetic inhibitors of subject 23479, 48120, or 46689 polypeptide. Screening methods are discussed in more detail below.

[4781] 23479, 48120, and 46689 Screening Assays

[4782] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to 23479, 48120, or 46689 proteins, have a stimulatory or inhibitory effect on, for example, 23479, 48120, or 46689 expression or 23479, 48120, or 46689 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 23479, 48120, or 46689 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 23479, 48120, or 46689 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[4783] In one embodiment, the invention provides assays for screening candidate or test compounds that are substrates of a 23479, 48120, or 46689 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate an activity of a 23479, 48120, or 46689 protein or polypeptide or a biologically active portion thereof.

[4784] In one embodiment, an activity of a 23479 or 48120 protein can be assayed by measuring or detecting 23479- or 48120-mediated de-ubiquitination. De-ubiquitination assays useful for detecting a ubiquitin carboxyl-terminal hydrolase activity are well known in the art and can be found, for example, in Zhu et al. (1997) Journal of Biological Chemistry 272:51-57, Mitch et al. (1999) American Journal of Physiology 276:C1132-C1138, Liu et al. (1999) Molecular and Cellular Biology 19:3029-3038, and in those cited in various reviews, for example, Ciechanover et al. (1994) The FASEB Journal 8:182-192, Ciechanover (1994) Biol. Chem. Hoppe-Seyler 375:565-581, Hershko et al. (1998) Annual Review of Biochemistry 67:425-479, Swartz (1999) Annual Review of Medicine 50:57-74, Ciechanover (1998) EMBO Journal 17:7151-7160, and D'Andrea et al. (1998) Critical Reviews in Biochemistry and Molecular Biology 33:337-352. These assays include, but are not limited to, the disappearance of substrate, including a decrease in the amount of polyubiquitin or ubiquitinated substrate protein or protein remnant, appearance of intermediate and end products, such as appearance of free ubiquitin monomers, general protein turnover, specific protein turnover, ubiquitin binding, binding to ubiquitinated substrate protein, subunit interaction, interaction with ATP, interaction with cellular components such as trans-acting regulatory factors, stabilization of specific proteins, and the like.

[4785] In one embodiment, an activity of a 46689 protein can be assayed in vitro. First, 46689 protein can be expressed in a bacterial cell and then purified, e.g., by means of a covalently attached affinity tag, e.g., a His-6 tag. Purified 46689 can then be incubated in buffer, e.g., 100 mM Tris-HCl, pH 8.5, along with a substrate molecule, e.g., a protein, lipid or small molecule, e.g., steroid, signaling molecule, toxin, or carcinogen, substrate. By measuring the optical density of the solution at various time points, the loss of substrate or increase in product can be monitored, which is a direct measure of 46689 activity. The appropriate wavelength used to measure optical density will depend upon the light absorption spectra of the substrate and product molecules. An example of such an assay, used to monitor the activity of a 2-Hydroxymuconic Semialdehyde Dehydrogenase enzyme, is provided in Inoue et al. (1995), J of Bacteriology 177(5):1196-1201, the contents of which are incorporated herein by reference.
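
The conversion of optical density readings into a reaction rate can be sketched as follows. The sketch assumes that the absorbance change is linear over the sampled interval and that the Beer-Lambert law (A = ε·l·c) holds for the species being followed. The extinction coefficient, path length, time points, and readings are hypothetical values chosen for illustration, not measurements for 46689 or any of its substrates.

```python
def slope(times, values):
    """Least-squares slope of values plotted against times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den


def rate_from_absorbance(times_min, absorbance, epsilon_per_M_cm, path_cm=1.0):
    """Rate of concentration change in M/min, from A = epsilon * path * concentration."""
    return slope(times_min, absorbance) / (epsilon_per_M_cm * path_cm)


# Hypothetical readings: product absorbance rising by 0.05 per minute.
times = [0.0, 1.0, 2.0, 3.0]
od = [0.100, 0.150, 0.200, 0.250]
rate = rate_from_absorbance(times, od, epsilon_per_M_cm=5000.0)  # about 1e-5 M/min
```

A negative rate would correspond to following the disappearance of substrate rather than the appearance of product.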

[4786] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann, R. N. et al. (1994) J. Med. Chem. 37:2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).

[4787] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. U.S.A. 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and Gallop et al. (1994) J. Med. Chem. 37:1233.

[4788] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner, U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. 5,223,409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382; Felici (1991) J. Mol. Biol. 222:301-310; Ladner supra.).

[4789] In one embodiment, an assay is a cell-based assay in which a cell which expresses a 23479, 48120, or 46689 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 23479, 48120, or 46689 activity is determined. Determining the ability of the test compound to modulate 23479, 48120, or 46689 activity can be accomplished by monitoring, for example, hydrolase activity, e.g., the hydrolysis of a substrate, e.g., a protein (e.g., a ubiquitinated protein or poly-ubiquitin), lipid, or small molecule, e.g., steroid, signaling molecule, toxin, or carcinogen, substrate. The cell, for example, can be of mammalian origin, e.g., human.

[4790] The ability of the test compound to modulate 23479, 48120, or 46689 binding to a compound, e.g., a 23479, 48120, or 46689 substrate, or to bind to 23479, 48120, or 46689 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 23479, 48120, or 46689 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 23479, 48120, or 46689 could be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 23479, 48120, or 46689 binding to a 23479, 48120, or 46689 substrate in a complex. For example, compounds (e.g., 23479, 48120, or 46689 substrates) can be labeled with ¹²⁵I, ³⁵S, ¹⁴C, or ³H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[4791] The ability of a compound (e.g., a 23479, 48120, or 46689 substrate) to interact with 23479, 48120, or 46689 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 23479, 48120, or 46689 without the labeling of either the compound or the 23479, 48120, or 46689. McConnell, H. M. et al. (1992) Science 257:1906-1912. As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 23479, 48120, or 46689.

[4792] In yet another embodiment, a cell-free assay is provided in which a 23479, 48120, or 46689 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 23479, 48120, or 46689 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 23479, 48120, or 46689 proteins to be used in assays of the present invention include fragments which participate in interactions with non-23479, 48120, or 46689 molecules, e.g., fragments with high surface probability scores.

[4793] Soluble and/or membrane-bound forms of isolated proteins (e.g., 23479, 48120, or 46689 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, Isotridecylpoly(ethylene glycol ether)ₙ, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[4794] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[4795] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternatively, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
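The distance dependence mentioned above can be illustrated with the standard Förster relation, E = 1 / (1 + (r/R0)⁶), which is not stated in this specification and is used here only as a general-purpose sketch. The function name and the 5 nm Förster radius are assumptions chosen for illustration, not values for any particular donor/acceptor pair.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Energy transfer efficiency from the standard Forster relation.

    E = 1 / (1 + (r / R0)**6), where r is the donor-acceptor distance and
    R0 is the distance at which transfer is 50% efficient.
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Forster radius must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


if __name__ == "__main__":
    r0 = 5.0  # hypothetical Forster radius in nm (assumption for illustration)
    for r in (2.5, 5.0, 10.0):
        # Efficiency falls steeply once r exceeds R0, which is why a strong
        # acceptor signal is read as close proximity (binding).
        print(f"r = {r:4.1f} nm -> E = {fret_efficiency(r, r0):.3f}")
```

Because of the sixth-power dependence, the efficiency changes sharply around R0, so the acceptor signal behaves almost like a bound/unbound switch at distances typical of protein complexes.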

[4796] In another embodiment, determining the ability of the 23479, 48120, or 46689 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal which can be used as an indication of real-time reactions between biological molecules.

[4797] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound, (which is not anchored), can be labeled, either directly or indirectly, with detectable labels discussed herein.

[4798] It may be desirable to immobilize either 23479, 48120, or 46689, an anti-23479, 48120, or 46689 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a 23479, 48120, or 46689 protein, or interaction of a 23479, 48120, or 46689 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/23479, 48120, or 46689 fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 23479, 48120, or 46689 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and complex formation is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 23479, 48120, or 46689 binding or activity determined using standard techniques.

[4799] Other techniques for immobilizing either a 23479, 48120, or 46689 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 23479, 48120, or 46689 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96-well plates (Pierce Chemical).

[4800] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[4801] In one embodiment, this assay is performed utilizing antibodies reactive with 23479, 48120, or 46689 protein or target molecules but which do not interfere with binding of the 23479, 48120, or 46689 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 23479, 48120, or 46689 protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 23479, 48120, or 46689 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 23479, 48120, or 46689 protein or target molecule.

[4802] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components, by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., (1993) Trends Biochem Sci 18:284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York.); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. (1999) Current Protocols in Molecular Biology, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., (1998) J Mol Recognit 11: 141-8; Hage, D. S., and Tweed, S. A. (1997) J Chromatogr B Biomed Sci Appl. 699:499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[4803] In a preferred embodiment, the assay includes contacting the 23479, 48120, or 46689 protein or biologically active portion thereof with a known compound which binds 23479, 48120, or 46689 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a 23479, 48120, or 46689 protein, wherein determining the ability of the test compound to interact with a 23479, 48120, or 46689 protein includes determining the ability of the test compound to preferentially bind to 23479, 48120, or 46689 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[4804] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 23479, 48120, or 46689 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of a 23479, 48120, or 46689 protein through modulation of the activity of a downstream effector of a 23479, 48120, or 46689 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[4805] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), a reaction mixture containing the target gene product and the binding partner is prepared, under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.

[4806] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[4807] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner, is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled, either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[4808] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[4809] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[4810] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in which either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[4811] In yet another aspect, the 23479, 48120, or 46689 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 23479, 48120, or 46689 (“23479, 48120, or 46689-binding proteins” or “23479, 48120, or 46689-bp”) and are involved in 23479, 48120, or 46689 activity. Such 23479, 48120, or 46689-bps can be activators or inhibitors of signals by the 23479, 48120, or 46689 proteins or 23479, 48120, or 46689 targets as, for example, downstream elements of a 23479, 48120, or 46689-mediated signaling pathway.

[4812] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a 23479, 48120, or 46689 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively, the 23479, 48120, or 46689 protein can be fused to the activator domain.) If the “bait” and the “prey” proteins are able to interact, in vivo, forming a 23479, 48120, or 46689-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., lacZ) which is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene which encodes the protein which interacts with the 23479, 48120, or 46689 protein.

[4813] In another embodiment, modulators of 23479, 48120, or 46689 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 23479, 48120, or 46689 mRNA or protein evaluated relative to the level of expression of 23479, 48120, or 46689 mRNA or protein in the absence of the candidate compound. When expression of 23479, 48120, or 46689 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 23479, 48120, or 46689 mRNA or protein expression. Alternatively, when expression of 23479, 48120, or 46689 mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 23479, 48120, or 46689 mRNA or protein expression. The level of 23479, 48120, or 46689 mRNA or protein expression can be determined by methods described herein for detecting 23479, 48120, or 46689 mRNA or protein.
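The comparison described above can be sketched in code as follows. This is a minimal illustration only: the specification does not prescribe a statistical test, so the exact two-sided permutation test, the 0.05 threshold, the function names, and the replicate values are all assumptions. The sketch also applies the significance criterion to both stimulators and inhibitors, which is an assumption beyond the text.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, control):
    """Exact two-sided permutation test on the difference of group means."""
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    n_control = len(pooled) - n_treated
    observed = abs(mean(treated) - mean(control))
    total = sum(pooled)
    extreme = 0
    count = 0
    # Enumerate every way of relabeling the pooled measurements.
    for idx in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_treated - (total - group_sum) / n_control)
        if diff >= observed - 1e-12:  # tolerance for floating-point rounding
            extreme += 1
        count += 1
    return extreme / count


def classify_candidate(treated, control, alpha=0.05):
    """Label a candidate compound from expression levels with and without it."""
    if permutation_p_value(treated, control) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(control) else "inhibitor"


if __name__ == "__main__":
    # Hypothetical normalized expression levels (four replicates each).
    without_compound = [1.0, 1.1, 0.9, 1.0]
    with_compound = [2.1, 2.3, 2.0, 2.2]
    print(classify_candidate(with_compound, without_compound))
```

An exact permutation test is used here because it needs no distributional assumptions and runs with the standard library alone; with very few replicates it cannot reach small p-values, so a real screen would need enough replicates or a different test.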

[4814] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a 23479, 48120, or 46689 protein can be confirmed in vivo, e.g., in an animal such as an animal model for a cellular proliferative or differentiative disorder.

[4815] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a 23479, 48120, or 46689 modulating agent, an antisense 23479, 48120, or 46689 nucleic acid molecule, a 23479, 48120, or 46689-specific antibody, or a 23479, 48120, or 46689-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[4816] 23479, 48120, and 46689 Detection Assays

[4817] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome, e.g., to locate gene regions associated with genetic disease or to associate 23479, 48120, or 46689 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[4818] 23479, 48120, and 46689 Chromosome Mapping

[4819] The 23479, 48120, or 46689 nucleotide sequences or portions thereof can be used to map the location of the 23479, 48120, or 46689 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 23479, 48120, or 46689 sequences with genes associated with disease.

[4820] Briefly, 23479, 48120, or 46689 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 23479, 48120, or 46689 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 23479, 48120, or 46689 sequences will yield an amplified fragment.

[4821] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes, and a full set of mouse chromosomes, can allow easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220:919-924).
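The logic of assigning a gene to a chromosome from such a panel can be sketched as a concordance check: the gene is assigned to the chromosome whose presence or absence across the hybrids matches the PCR result in every cell line. The panel contents, hybrid names, and function name below are hypothetical and serve only to illustrate the reasoning.

```python
def concordant_chromosomes(panel, pcr_positive):
    """Return chromosomes whose retention pattern matches the PCR results.

    panel: maps hybrid name -> set of human chromosomes retained in that line.
    pcr_positive: maps hybrid name -> True if the primers amplified a fragment.
    """
    candidates = set().union(*panel.values())
    matches = [
        chrom
        for chrom in candidates
        if all((chrom in retained) == pcr_positive[name]
               for name, retained in panel.items())
    ]
    return sorted(matches)


if __name__ == "__main__":
    # Hypothetical four-line panel; real panels cover all human chromosomes.
    panel = {
        "hybrid_A": {"1", "7", "X"},
        "hybrid_B": {"7", "12"},
        "hybrid_C": {"1", "12"},
        "hybrid_D": {"3", "X"},
    }
    pcr_positive = {
        "hybrid_A": True,
        "hybrid_B": True,
        "hybrid_C": False,
        "hybrid_D": False,
    }
    print(concordant_chromosomes(panel, pcr_positive))
```

If more than one chromosome is returned, the panel does not discriminate between them and additional hybrids are needed; an empty result suggests a scoring error or a panel that does not retain the relevant chromosome.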

[4822] Other mapping strategies, e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries, can be used to map 23479, 48120, or 46689 to a chromosomal location.

[4823] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques ((1988) Pergamon Press, New York).

[4824] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[4825] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325:783-787.

[4826] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 23479, 48120, or 46689 gene, can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

[4827] 23479, 48120, and 46689 Tissue Typing

[4828] 23479, 48120, or 46689 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).
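The principle behind RFLP can be sketched with a simple in silico digest: a polymorphism that creates or destroys a restriction site changes the set of fragment lengths. The example below uses the EcoRI recognition site (GAATTC, cut after the first G) on the top strand of a linear sequence; the two short allele sequences and the function name are hypothetical and are not derived from the sequences of the invention.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence at each site.

    cut_offset is the number of bases into the recognition site at which the
    top strand is cut (1 for EcoRI, which cuts G^AATTC).
    """
    sequence = sequence.upper()
    cuts = []
    pos = sequence.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = sequence.find(site, pos + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Hypothetical alleles: allele_b carries a single-base change (A -> G)
    # that destroys the second EcoRI site.
    allele_a = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TT"
    allele_b = "AAAA" + "GAATTC" + "CCCCCC" + "GAGTTC" + "TT"
    print("allele A fragments:", digest(allele_a))
    print("allele B fragments:", digest(allele_b))
```

On a Southern blot, only fragments that hybridize to the probe are visible, so a real analysis would additionally filter the fragments by probe overlap; that step is omitted here for brevity.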

[4829] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 23479, 48120, or 46689 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.

[4830] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences of SEQ ID NO:74 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO:76 are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[4831] If a panel of reagents from 23479, 48120, or 46689 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[4832] Use of Partial 23479, 48120, or 46689 Sequences in Forensic Biology

[4833] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[4834] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e. another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO:74 (e.g., fragments derived from the noncoding regions of SEQ ID NO:74 having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[4835] The 23479, 48120, or 46689 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 23479, 48120, or 46689 probes can be used to identify tissue by species and/or by organ type.

[4836] In a similar fashion, these reagents, e.g., 23479, 48120, or 46689 primers or probes, can be used to screen tissue culture for contamination (i.e., screen for the presence of a mixture of different types of cells in a culture).

[4837] Predictive Medicine of 23479, 48120, and 46689

[4838] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[4839] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene which encodes 23479, 48120, or 46689.

[4840] Such disorders can include, e.g., a disorder associated with the misexpression of a 23479 or 48120 gene; a disorder associated with abnormal de-ubiquitination activity; and a disorder associated with abnormal protein degradation activity. Particularly preferred disorders include cellular proliferative and/or differentiative disorders, cardiovascular disorders, and brain disorders. For example, preferred disorders include atherosclerosis, disorders associated with oxidative damage, cellular oxidative stress-related glucocorticoid responsiveness, and disorders characterized by unwanted free radicals, e.g., in ischaemia reperfusion injury. Alternatively, the disorders can include, e.g., a disorder associated with the misexpression of a 46689 gene; a disorder associated with abnormal hydrolase activity, e.g., a metabolic disorder; a cellular proliferative or differentiative disorder, e.g., in the lung, brain, ovary, or breast; a neural disorder; a hematopoietic disorder; or a liver disorder.

[4841] The method includes one or more of the following:

[4842] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 23479, 48120, or 46689 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[4843] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 23479, 48120, or 46689 gene;

[4844] detecting, in a tissue of the subject, the misexpression of the 23479, 48120, or 46689 gene, at the mRNA level, e.g., detecting a non-wild type level of an mRNA;

[4845] detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of a 23479, 48120, or 46689 polypeptide.

[4846] In preferred embodiments the method includes ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 23479, 48120, or 46689 gene; an insertion of one or more nucleotides into the gene; a point mutation, e.g., a substitution of one or more nucleotides of the gene; or a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[4847] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from SEQ ID NO:74, or naturally occurring mutants thereof or 5′ or 3′ flanking sequences naturally associated with the 23479, 48120, or 46689 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[4848] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 23479, 48120, or 46689 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 23479, 48120, or 46689.

[4849] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[4850] In preferred embodiments the method includes determining the structure of a 23479, 48120, or 46689 gene, an abnormal structure being indicative of risk for the disorder.

[4851] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 23479, 48120, or 46689 protein or a nucleic acid, which hybridizes specifically with the gene. These and other embodiments are discussed below.

[4852] Diagnostic and Prognostic Assays of 23479, 48120, and 46689

[4853] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 23479, 48120, or 46689 molecules and for identifying variations and mutations in the sequence of 23479, 48120, or 46689 molecules.

[4854] Expression Monitoring and Profiling:

[4855] The presence, level, or absence of 23479, 48120, or 46689 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 23479, 48120, or 46689 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 23479, 48120, or 46689 protein such that the presence of 23479, 48120, or 46689 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the 23479, 48120, or 46689 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 23479, 48120, or 46689 genes; measuring the amount of protein encoded by the 23479, 48120, or 46689 genes; or measuring the activity of the protein encoded by the 23479, 48120, or 46689 genes.

[4856] The level of mRNA corresponding to the 23479, 48120, or 46689 gene in a cell can be determined both by in situ and by in vitro formats.

[4857] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 23479, 48120, or 46689 nucleic acid, such as the nucleic acid of SEQ ID NO:74, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 23479, 48120, or 46689 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[4858] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 23479, 48120, or 46689 genes.

[4859] The level of mRNA in a sample that is encoded by one of 23479, 48120, or 46689 can be evaluated with nucleic acid amplification, e.g., by rtPCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
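The general size ranges given above for amplification primers can be expressed as a simple check. The following is a minimal sketch only; the function names `reverse_complement` and `check_primer_pair`, the exact-match annealing model, and the 0-based coordinates are illustrative assumptions and are not part of the disclosure.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def check_primer_pair(template, forward, reverse):
    """Locate a primer pair on the plus strand of `template` and report whether
    the primer lengths (about 10 to 30 nt) and the flanked region (about 50 to
    200 nt) fall within the general ranges stated in the text.

    Annealing is modeled as an exact string match (an assumption). Returns
    None if either primer is not found in the expected orientation.
    """
    template = template.upper()
    fwd_start = template.find(forward.upper())
    if fwd_start < 0:
        return None
    fwd_end = fwd_start + len(forward)
    # The reverse primer anneals to the plus strand, so its reverse
    # complement is what appears in the plus-strand sequence.
    rev_start = template.find(reverse_complement(reverse), fwd_end)
    if rev_start < 0:
        return None
    flanked = rev_start - fwd_end
    return {
        "flanked_length": flanked,
        "primer_lengths_ok": all(10 <= len(p) <= 30 for p in (forward, reverse)),
        "flanked_length_ok": 50 <= flanked <= 200,
    }
```

A real primer design would also consider melting temperature and mismatches, which this sketch deliberately ignores.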

[4860] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 23479, 48120, or 46689 gene being analyzed.

[4861] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 23479, 48120, or 46689 mRNA, or genomic DNA, and comparing the presence of 23479, 48120, or 46689 mRNA or genomic DNA in the control sample with the presence of 23479, 48120, or 46689 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 23479, 48120, or 46689 transcript levels.

[4862] A variety of methods can be used to determine the level of protein encoded by 23479, 48120, or 46689. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody, with a sample to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)₂) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[4863] The detection methods can be used to detect 23479, 48120, or 46689 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 23479, 48120, or 46689 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 23479, 48120, or 46689 protein include introducing into a subject a labeled anti-23479, 48120, or 46689 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-23479, 48120, or 46689 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[4864] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 23479, 48120, or 46689 protein, and comparing the presence of 23479, 48120, or 46689 protein in the control sample with the presence of 23479, 48120, or 46689 protein in the test sample.

[4865] The invention also includes kits for detecting the presence of 23479, 48120, or 46689 in a biological sample. For example, the kit can include a compound or agent capable of detecting 23479, 48120, or 46689 protein or mRNA in a biological sample; and a standard.

[4866] The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 23479, 48120, or 46689 protein or nucleic acid.

[4867] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[4868] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples that can be assayed and compared to the test sample. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[4869] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with misexpressed or aberrant or unwanted 23479, 48120, or 46689 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as a cellular proliferative or differentiative disorder, e.g., in the lung, brain, ovary, or breast, as well as a neural disorder, a hematopoietic disorder, a cardiovascular disorder, or a liver disorder.

[4870] In one embodiment, a disease or disorder associated with aberrant or unwanted 23479, 48120, or 46689 expression or activity is identified. A test sample is obtained from a subject and 23479, 48120, or 46689 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 23479, 48120, or 46689 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 23479, 48120, or 46689 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[4871] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 23479, 48120, or 46689 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for, e.g., a cellular proliferative or differentiative disorder.

[4872] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 23479, 48120, or 46689 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 23479, 48120, or 46689 (e.g., other genes associated with a 23479, 48120, or 46689-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
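One way to picture such data records is as rows of a relational table. The sketch below uses Python's built-in `sqlite3` module purely for illustration; the table name, column names, sample identifiers, and expression values are all hypothetical and do not come from the disclosure, which names Oracle and Sybase environments as examples.

```python
import sqlite3


def build_example_database():
    """Create an in-memory database holding a few expression data records.

    Each record stores an expression value for one gene in one sample,
    together with descriptors of the sample. All values are made up.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE expression_record (
            sample_id        TEXT NOT NULL,  -- identifier of the sample
            subject_id       TEXT,           -- subject the sample came from
            diagnosis        TEXT,           -- optional descriptor
            treatment        TEXT,           -- optional descriptor
            gene             TEXT NOT NULL,  -- e.g. '23479' or another gene
            expression_level REAL NOT NULL,  -- measured level, arbitrary units
            PRIMARY KEY (sample_id, gene)
        )
        """
    )
    rows = [
        ("S1", "P1", "unaffected", None, "23479", 1.0),
        ("S1", "P1", "unaffected", None, "48120", 0.8),
        ("S2", "P2", "proliferative disorder", "agent A", "23479", 3.5),
        ("S2", "P2", "proliferative disorder", "agent A", "48120", 0.9),
    ]
    conn.executemany(
        "INSERT INTO expression_record VALUES (?, ?, ?, ?, ?, ?)", rows
    )
    conn.commit()
    return conn


def levels_for_gene(conn, gene):
    """Return (sample_id, expression_level) pairs for one gene, by sample."""
    cur = conn.execute(
        "SELECT sample_id, expression_level FROM expression_record "
        "WHERE gene = ? ORDER BY sample_id",
        (gene,),
    )
    return cur.fetchall()
```

Storing one row per sample and gene lets the same table hold values for genes other than 23479, 48120, or 46689 without a schema change.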

[4873] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 23479, 48120, or 46689 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used to diagnose, e.g., a cellular proliferative or differentiative disorder, e.g., in the lung, brain, ovary, or breast, in a subject wherein an increase in 23479, 48120, or 46689 expression is an indication that the subject has or is disposed to having such a disorder. The method can be used to monitor a treatment for a disorder, e.g., a cellular proliferative or differentiative disorder, in a subject. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[4874] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 23479, 48120, or 46689 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from an uncontacted cell.
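The favorable-evaluation criterion in the preceding paragraph can be sketched as a single comparison. The text does not fix a similarity measure, so the Euclidean distance used here (via `math.dist`, available in Python 3.8 and later) is an assumption, as is the function name `evaluate_test_compound`.

```python
import math


def evaluate_test_compound(contacted, uncontacted, target):
    """Return True if the profile of the contacted cell is more similar to
    the target profile than the profile of the uncontacted cell is.

    Each profile is a sequence of expression values listed in the same gene
    order. Similarity is taken to mean smaller Euclidean distance, which is
    an assumed measure and not one specified in the text.
    """
    return math.dist(contacted, target) < math.dist(uncontacted, target)
```

For instance, with a target of `[1.0, 1.0]` and an uncontacted profile of `[4.0, 1.0]`, a contacted profile of `[2.0, 1.0]` would be evaluated favorably under this assumed measure.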

[4875] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 23479, 48120, or 46689 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profile is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
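The metric mentioned above, the length of the difference vector between two profiles, together with the selection of the most similar reference profile, can be sketched as follows. Representing a profile as a dictionary keyed by gene name, and the names `profile_distance` and `most_similar_reference`, are illustrative assumptions.

```python
import math


def profile_distance(profile_a, profile_b):
    """Length of the difference vector between two expression profiles.

    Profiles are dicts mapping gene name -> expression value, and both must
    cover exactly the same genes so that each gene is one vector dimension.
    """
    if profile_a.keys() != profile_b.keys():
        raise ValueError("profiles must cover the same genes")
    return math.sqrt(
        sum((profile_a[gene] - profile_b[gene]) ** 2 for gene in profile_a)
    )


def most_similar_reference(subject, references):
    """Return (name, distance) for the reference profile closest to `subject`.

    `references` maps a reference name to a profile dict.
    """
    name = min(
        references, key=lambda n: profile_distance(subject, references[n])
    )
    return name, profile_distance(subject, references[name])
```

Other routine measures, such as a correlation coefficient, could be substituted for the distance function without changing the selection step.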

[4876] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[4877] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile, and the reference expression profiles each include a value representing the level of 23479, 48120, or 46689 expression.

[4878] 23479, 48120, and 46689 Arrays and Uses Thereof

[4879] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to a 23479, 48120, or 46689 molecule (e.g., a 23479, 48120, or 46689 nucleic acid or a 23479, 48120, or 46689 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm², and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to addresses of the plurality can be disposed on the array.

[4880] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to a 23479, 48120, or 46689 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 23479, 48120, or 46689. Each address of the subset can include a capture probe that hybridizes to a different region of a 23479, 48120, or 46689 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for a 23479, 48120, or 46689 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 23479, 48120, or 46689 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 23479, 48120, or 46689 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).

[4881] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[4882] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to a 23479, 48120, or 46689 polypeptide or fragment thereof. The polypeptide can be a naturally-occurring interaction partner of 23479, 48120, or 46689 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-23479, 48120, or 46689 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[4883] In another aspect, the invention features a method of analyzing the expression of 23479, 48120, or 46689. The method includes providing an array as described above; contacting the array with a sample and detecting binding of a 23479, 48120, or 46689-molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[4884] In another embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array, particularly the expression of 23479, 48120, or 46689. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 23479, 48120, or 46689. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.

[4885] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 23479, 48120, or 46689 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[4886] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of other than the target gene can be ascertained and counteracted.

[4887] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of a 23479, 48120, or 46689-associated disease or disorder; and processes, such as a cellular transformation associated with a 23479, 48120, or 46689-associated disease or disorder. The method can also evaluate the treatment and/or progression of a 23479, 48120, or 46689-associated disease or disorder.

[4888] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 23479, 48120, or 46689) that could serve as a molecular target for diagnosis or therapeutic intervention.

[4889] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon a 23479, 48120, or 46689 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to a 23479, 48120, or 46689 polypeptide or fragment thereof. For example, multiple variants of a 23479, 48120, or 46689 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the addresses of the plurality can be disposed on the array.

[4890] The polypeptide array can be used to detect a 23479, 48120, or 46689 binding compound, e.g., an antibody in a sample from a subject with specificity for a 23479, 48120, or 46689 polypeptide or the presence of a 23479, 48120, or 46689-binding protein or ligand.

[4891] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 23479, 48120, or 46689 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[4892] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 23479, 48120, or 46689 or from a cell or subject in which a 23479, 48120, or 46689 mediated response has been elicited, e.g., by contact of the cell with 23479, 48120, or 46689 nucleic acid or protein, or administration to the cell or subject of 23479, 48120, or 46689 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 23479, 48120, or 46689 (or does not express as highly as in the case of the 23479, 48120, or 46689 positive plurality of capture probes) or from a cell or subject in which a 23479, 48120, or 46689 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which are preferably other than a 23479, 48120, or 46689 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[4893] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, contacting the array with a first sample from a cell or subject which expresses or mis-expresses 23479, 48120, or 46689 or from a cell or subject in which a 23479, 48120, or 46689-mediated response has been elicited, e.g., by contact of the cell with 23479, 48120, or 46689 nucleic acid or protein, or administration to the cell or subject of 23479, 48120, or 46689 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a second sample from a cell or subject which does not express 23479, 48120, or 46689 (or does not express as highly as in the case of the 23479, 48120, or 46689 positive plurality of capture probes) or from a cell or subject in which a 23479, 48120, or 46689 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used the plurality of addresses with capture probes should be present on both arrays.

[4894] In another aspect, the invention features a method of analyzing 23479, 48120, or 46689, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 23479, 48120, or 46689 nucleic acid or amino acid sequence; and comparing the 23479, 48120, or 46689 sequence with one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database; to thereby analyze 23479, 48120, or 46689.

[4895] Detection of 23479, 48120, and 46689 Variations or Mutations

[4896] The methods of the invention can also be used to detect genetic alterations in a 23479, 48120, or 46689 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by misregulation in 23479, 48120, or 46689 protein activity or nucleic acid expression, such as a cellular proliferative or differentiative disorder, e.g., in the lung, brain, ovary, or breast, a neural disorder, a hematopoietic disorder, a cardiovascular disorder, or a liver disorder. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a 23479, 48120, or 46689-protein, or the mis-expression of the 23479, 48120, or 46689 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of 1) a deletion of one or more nucleotides from a 23479, 48120, or 46689 gene; 2) an addition of one or more nucleotides to a 23479, 48120, or 46689 gene; 3) a substitution of one or more nucleotides of a 23479, 48120, or 46689 gene, 4) a chromosomal rearrangement of a 23479, 48120, or 46689 gene; 5) an alteration in the level of a messenger RNA transcript of a 23479, 48120, or 46689 gene, 6) aberrant modification of a 23479, 48120, or 46689 gene, such as of the methylation pattern of the genomic DNA, 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of a 23479, 48120, or 46689 gene, 8) a non-wild type level of a 23479, 48120, or 46689-protein, 9) allelic loss of a 23479, 48120, or 46689 gene, and 10) inappropriate post-translational modification of a 23479, 48120, or 46689-protein.

[4897] An alteration can be detected with a probe/primer in a polymerase chain reaction, such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 23479, 48120, or 46689-gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a 23479, 48120, or 46689 gene under conditions such that hybridization and amplification of the 23479, 48120, or 46689-gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[4898] In another embodiment, mutations in a 23479, 48120, or 46689 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA are isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment lengths are determined, e.g., by gel electrophoresis, and compared. Differences in fragment lengths between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
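
By way of non-limiting illustration, the comparison of restriction fragment lengths between sample and control DNA can be sketched in silico as follows. The function names `digest` and `patterns_differ`, the example sequences, and the use of an EcoRI-like recognition site (GAATTC, cut one base into the site) are assumptions made for this sketch only; a real assay measures fragment lengths on a gel rather than computing them from sequence.

```python
from typing import List


def digest(seq: str, site: str, cut_offset: int = 0) -> List[int]:
    """Return sorted fragment lengths after cutting seq at each occurrence of site.

    Only the given strand is searched, which is sufficient for a
    palindromic recognition site. cut_offset is the position of the cut
    within the site (hypothetical parameter for this sketch).
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    # Drop zero-length fragments that arise when a cut falls on an end.
    return sorted(b - a for a, b in zip(bounds, bounds[1:]) if b > a)


def patterns_differ(sample: str, control: str, site: str, cut_offset: int = 0) -> bool:
    """True when sample and control give different fragment-length patterns."""
    return digest(sample, site, cut_offset) != digest(control, site, cut_offset)


# Illustrative sequences: the sample carries a point change that destroys the site.
control_dna = "AAAAGAATTCTTTTTTGGGG"
sample_dna = "AAAAGAGTTCTTTTTTGGGG"
print(digest(control_dna, "GAATTC", 1), digest(sample_dna, "GAATTC", 1))
```

In this sketch the control is cut into two fragments while the altered sample remains uncut, which is the kind of difference the embodiment relies on.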

[4899] In other embodiments, genetic mutations in 23479, 48120, or 46689 can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the others. A different probe is located at each address of the plurality. A probe can be complementary to a region of a 23479, 48120, or 46689 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of a 23479, 48120, or 46689 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 23479, 48120, or 46689 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
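
The first scanning step, in which sequential overlapping probes are tiled along a control sequence, can be illustrated with the following minimal sketch. It idealizes hybridization as an exact sequence match, which is an assumption of the sketch and not a property of any actual array; the names `tile` and `unmatched_tiles` and the example sequences are likewise hypothetical.

```python
from typing import List


def tile(seq: str, probe_len: int, step: int = 1) -> List[str]:
    """Return sequential overlapping windows of seq, one per array address."""
    return [seq[i:i + probe_len] for i in range(0, len(seq) - probe_len + 1, step)]


def unmatched_tiles(control: str, sample: str, probe_len: int) -> List[int]:
    """Return start positions of control-derived tiles with no exact match in sample.

    A run of consecutive unmatched tiles brackets the position of a base
    change. Exact string matching stands in for perfect-match hybridization.
    """
    sample_windows = set(tile(sample, probe_len))
    return [i for i, probe in enumerate(tile(control, probe_len))
            if probe not in sample_windows]


# Illustrative sequences: the sample differs from the control at index 8 (G to T).
control_seq = "AAAACCCCGGGGTTTT"
sample_seq = "AAAACCCCTGGGTTTT"
print(unmatched_tiles(control_seq, sample_seq, 4))
```

Here the unmatched tiles are exactly those that overlap the altered base, so the change is localized to the position shared by all of them.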

[4900] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 23479, 48120, or 46689 gene and detect mutations by comparing the sequence of the sample 23479, 48120, or 46689 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.

[4901] Other methods for detecting mutations in the 23479, 48120, or 46689 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl. Acad Sci USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295).

[4902] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 23479, 48120, or 46689 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymidine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[4903] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 23479, 48120, or 46689 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc. Natl. Acad. Sci. USA 86:2766; see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control 23479, 48120, or 46689 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence; the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).

[4904] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[4905] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad. Sci USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[4906] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.
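
As a non-limiting sketch of the 3′-terminal design described above, the following builds a pair of forward primers that share the same 5′ portion and differ only at the extreme 3′ base, which is placed on the variant position. The function name `allele_specific_primers`, the default primer length, and the example sequence are hypothetical; practical primer design would also weigh melting temperature and specificity, which this sketch ignores.

```python
from typing import Tuple


def allele_specific_primers(wild_type: str, pos: int, mutant_base: str,
                            length: int = 20) -> Tuple[str, str]:
    """Return (wild-type primer, mutant primer) ending at index pos.

    Both primers are written 5' to 3' and are identical except for the
    last (3'-terminal) base, so extension depends on the allele present.
    """
    if not 0 <= pos < len(wild_type):
        raise ValueError("pos is outside the sequence")
    if mutant_base == wild_type[pos]:
        raise ValueError("mutant base must differ from the wild-type base")
    if pos - length + 1 < 0:
        raise ValueError("not enough upstream sequence for the requested length")
    shared = wild_type[pos - length + 1:pos]
    return shared + wild_type[pos], shared + mutant_base


# Illustrative sequence with a G to T change at index 10.
wt_primer, mut_primer = allele_specific_primers(
    "ACGTACGTACGTACGTACGTACGT", pos=10, mutant_base="T", length=8)
print(wt_primer, mut_primer)
```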

[4907] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 50%, 60%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to a 23479, 48120, or 46689 nucleic acid.

[4908] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO:74, SEQ ID NO:77, SEQ ID NO:80, or the complement of SEQ ID NO:74, SEQ ID NO:77, or SEQ ID NO:80. Different locations can be different but overlapping, or non-overlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[4909] The set can be useful, e.g., for identifying SNPs, or identifying specific alleles of 23479, 48120, or 46689. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a biallelic or polymorphic locus.

[4910] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymidine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the T_(m) of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
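
For illustration only, a set of four oligonucleotides that differ solely at the interrogation position can be generated as in the following sketch. The probes are built as reverse complements of a window of the target strand; the function names, the flank length, and the example sequence are assumptions of the sketch and do not correspond to any sequence of the invention.

```python
from typing import Dict

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def interrogation_set(target: str, pos: int, flank: int = 10) -> Dict[str, str]:
    """Return four probes keyed by the target-strand base each one detects.

    Each probe is complementary to target[pos - flank : pos + flank + 1]
    with one of A, C, G, or T substituted at the interrogation position.
    """
    if pos - flank < 0 or pos + flank >= len(target):
        raise ValueError("interrogation position is too close to an end")
    left = target[pos - flank:pos]
    right = target[pos + 1:pos + flank + 1]
    return {base: reverse_complement(left + base + right) for base in "ACGT"}


probes = interrogation_set("ACGTACGTACGTACGTACGTACGTA", pos=12, flank=5)
for base, probe in probes.items():
    print(base, probe)
```

Because the window is symmetric about the interrogation position, the variable base sits at the center of every probe, and only the probe matching the allele present is fully complementary.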

[4911] In a preferred embodiment, the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, a 23479, 48120, or 46689 nucleic acid.

[4912] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a 23479, 48120, or 46689 gene.

[4913] Use of 23479, 48120, or 46689 Molecules as Surrogate Markers

[4914] The 23479, 48120, or 46689 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 23479, 48120, or 46689 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 23479, 48120, or 46689 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker which correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[4915] The 23479, 48120, or 46689 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker which correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., a 23479, 48120, or 46689 marker) transcription or expression, the amplified marker may be in a quantity which is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-23479, 48120, or 46689 antibodies may be employed in an immune-based detection system for a 23479, 48120, or 46689 protein marker, or 23479, 48120, or 46689-specific radiolabeled probes may be used to detect a 23479, 48120, or 46689 mRNA marker. 
Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[4916] The 23479, 48120, or 46689 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker that correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35:1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA, or protein (e.g., 23479, 48120, or 46689 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 23479, 48120, or 46689 DNA may correlate with drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[4917] Pharmaceutical Compositions of 23479, 48120, and 46689

[4918] The nucleic acid and polypeptides, fragments thereof, as well as anti-23479, 48120, or 46689 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[4919] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[4920] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol or sorbitol, or sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

[4921] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying which yields a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[4922] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[4923] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[4924] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[4925] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[4926] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[4927] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[4928] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds which exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
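
The therapeutic index defined above is a simple ratio, as the following minimal sketch shows. The function name and the numerical values are hypothetical and serve only to illustrate the arithmetic; LD50 and ED50 must be expressed in the same units.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index, i.e., the ratio LD50/ED50."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical values in mg/kg: a larger ratio indicates a wider safety margin.
print(therapeutic_index(ld50=500.0, ed50=25.0))
```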

[4929] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.

[4930] As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
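
As an arithmetic illustration of the ranges recited above, the following sketch converts a per-kilogram dosage into a once-weekly schedule. The function name `weekly_regimen` is hypothetical, and treating the approximate limits of 0.001 to 30 mg/kg and 1 to 10 weeks as hard bounds is a simplification made for this sketch; it is not clinical guidance.

```python
from typing import List, Tuple


def weekly_regimen(dose_mg_per_kg: float, body_weight_kg: float,
                   weeks: int) -> List[Tuple[int, float]]:
    """Return (week number, amount in mg) for a once-weekly administration."""
    if not 0.001 <= dose_mg_per_kg <= 30:
        raise ValueError("dosage outside the illustrated 0.001 to 30 mg/kg range")
    if not 1 <= weeks <= 10:
        raise ValueError("duration outside the illustrated 1 to 10 week range")
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    amount_mg = dose_mg_per_kg * body_weight_kg
    return [(week, amount_mg) for week in range(1, weeks + 1)]


# Hypothetical subject: 5 mg/kg for a 70 kg subject, once weekly for 4 weeks.
for week, amount in weekly_regimen(5.0, 70.0, 4):
    print(week, amount)
```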

[4931] For antibodies, the preferred dosage is 0.1 mg/kg to 100 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J. Acquired Immune Deficiency Syndromes and Human Retrovirology 14:193).

[4932] The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[4933] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

[4934] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, puromycin, maytansinoids, e.g., maytansinol (see U.S. Pat. No. 5,208,020), CC-1065 (see U.S. Pat. Nos. 5,475,092, 5,585,499, 5,846,545) and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, CC-1065, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine, vinblastine, taxol and maytansinoids). Radioactive ions include, but are not limited to, iodine, yttrium and praseodymium.

[4935] The conjugates of the invention can be used for modifying a given biological response; the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[4936] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[4937] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[4938] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[4939] Methods of Treatment for 23479, 48120, and 46689

[4940] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 23479, 48120, or 46689 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[4941] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. “Pharmacogenomics”, as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype”, or “drug response genotype”). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 23479, 48120, or 46689 molecules of the present invention or 23479, 48120, or 46689 modulators according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

[4942] In one aspect, the invention provides a method for preventing in a subject, a disease or condition associated with an aberrant or unwanted 23479, 48120, or 46689 expression or activity, by administering to the subject a 23479, 48120, or 46689 or an agent which modulates 23479, 48120, or 46689 expression or at least one 23479, 48120, or 46689 activity. Subjects at risk for a disease which is caused or contributed to by aberrant or unwanted 23479, 48120, or 46689 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 23479, 48120, or 46689 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 23479, 48120, or 46689 aberrance, for example, a 23479, 48120, or 46689, 23479, 48120, or 46689 agonist or 23479, 48120, or 46689 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[4943] It is possible that some 23479, 48120, or 46689 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[4944] The 23479, 48120, and 46689 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of cellular proliferative or differentiative disorders, e.g., in the lung, brain, ovary, or breast, neural disorders, hematopoietic disorders, cardiovascular disorders, or liver disorders, as discussed above. The 23479, 48120, and 46689 molecules can also act as novel diagnostic targets and therapeutic agents for controlling one or more disorders associated with bone metabolism, viral diseases, and pain or metabolic disorders.

[4945] Aberrant expression and/or activity of 23479, 48120, or 46689 molecules may mediate disorders associated with bone metabolism. “Bone metabolism” refers to direct or indirect effects in the formation or degeneration of bone structures, e.g., bone formation, bone resorption, etc., which may ultimately affect the concentrations in serum of calcium and phosphate. This term also includes activities mediated by the effects of 23479, 48120, or 46689 molecules in bone cells, e.g., osteoclasts and osteoblasts, that may in turn result in bone formation and degeneration. For example, 23479, 48120, or 46689 molecules may support different activities of bone resorbing osteoclasts such as the stimulation of differentiation of monocytes and mononuclear phagocytes into osteoclasts. Accordingly, 23479, 48120, or 46689 molecules that modulate the production of bone cells can influence bone formation and degeneration, and thus may be used to treat bone disorders. Examples of such disorders include, but are not limited to, osteoporosis, osteodystrophy, osteomalacia, rickets, osteitis fibrosa cystica, renal osteodystrophy, osteosclerosis, anti-convulsant treatment, osteopenia, fibrogenesis-imperfecta ossium, secondary hyperparathyroidism, hypoparathyroidism, hyperparathyroidism, cirrhosis, obstructive jaundice, drug induced metabolism, medullary carcinoma, chronic renal disease, rickets, sarcoidosis, glucocorticoid antagonism, malabsorption syndrome, steatorrhea, tropical sprue, idiopathic hypercalcemia and milk fever.

[4946] The 23479, 48120, or 46689 molecules of the invention may play an important role in the etiology of certain viral diseases, including but not limited to Hepatitis B, Hepatitis C and Herpes Simplex Virus (HSV). Modulators of 23479, 48120, or 46689 activity could be used to control viral diseases. The modulators can be used in the treatment and/or diagnosis of viral infected tissue or virus-associated tissue fibrosis, especially liver and liver fibrosis. Also, 23479, 48120, or 46689 modulators can be used in the treatment and/or diagnosis of virus-associated carcinoma, especially hepatocellular cancer.

[4947] Additionally, 23479, 48120, or 46689 may play an important role in the regulation of metabolism or pain disorders. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes. Examples of pain disorders include, but are not limited to, pain response elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia, usually referred to as hyperalgesia (described in, for example, Fields, H. L. (1987) Pain, New York: McGraw-Hill); pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches; pain associated with surgery; pain related to irritable bowel syndrome; or chest pain.

[4948] As discussed, successful treatment of 23479, 48120, or 46689 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using an assay described above, that prove to exhibit negative modulatory activity can be used in accordance with the invention to prevent and/or ameliorate symptoms of 23479, 48120, or 46689 disorders. Such molecules can include, but are not limited to, peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)₂ and Fab expression library fragments, scFV molecules, and epitope-binding fragments thereof).

[4949] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[4950] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[4951] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 23479, 48120, or 46689 expression is through the use of aptamer molecules specific for 23479, 48120, or 46689 protein. Aptamers are nucleic acid molecules having a tertiary structure which permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. (1997) Curr. Opin. Chem Biol. 1: 5-9; and Patel, D. J. (1997) Curr Opin Chem Biol 1:32-46). Since nucleic acid molecules may in many cases be more conveniently introduced into target cells than therapeutic protein molecules may be, aptamers offer a method by which 23479, 48120, or 46689 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[4952] Antibodies can be generated that are both specific for target gene product and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 23479, 48120, or 46689 disorders. For a description of antibodies, see the Antibody section above.

[4953] In circumstances wherein injection of an animal or a human subject with a 23479, 48120, or 46689 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 23479, 48120, or 46689 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. (1999) Ann Med 31:66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. (1998) Cancer Treat Res. 94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 23479, 48120, or 46689 protein. Vaccines directed to a disease characterized by 23479, 48120, or 46689 expression may also be generated in this fashion.

[4954] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see e.g., Marasco et al. (1993) Proc. Natl. Acad. Sci. USA 90:7889-7893).

[4955] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 23479, 48120, or 46689 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures as described above.

[4956] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED₅₀ with little or no toxicity. The dosage can vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose can be formulated in animal models to achieve a circulating plasma concentration range that includes the IC₅₀ (i.e., the concentration of the test compound that achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma can be measured, for example, by high performance liquid chromatography. Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. The compound which is able to modulate 23479, 48120, or 46689 activity is used as a template, or “imprinting molecule”, to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix which contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al (1996) Current Opinion in Biotechnology 7:89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2:166-173. 
Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al (1993) Nature 361:645-647. Through the use of isotope-labeling, the “free” concentration of compound which modulates the expression or activity of 23479, 48120, or 46689 can be readily monitored and used in calculations of IC₅₀.
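
The use of measured concentrations in calculating an IC₅₀ can be illustrated with a short sketch. It is only one simple possibility, assuming that percent inhibition has been measured at several compound concentrations and that the half-maximal point is found by linear interpolation on a log-concentration scale. The function name `estimate_ic50` and the sample values are hypothetical and do not come from the assays described above.

```python
import math

def estimate_ic50(concentrations, percent_inhibition):
    """Estimate the concentration giving 50% inhibition.

    Assumption: interpolate linearly in log10(concentration) between the
    two measured points that bracket 50% inhibition. Returns None when
    the measurements never bracket 50%.
    """
    points = sorted(zip(concentrations, percent_inhibition))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if y_lo <= 50.0 <= y_hi and y_hi > y_lo:
            fraction = (50.0 - y_lo) / (y_hi - y_lo)
            log_lo, log_hi = math.log10(c_lo), math.log10(c_hi)
            return 10 ** (log_lo + fraction * (log_hi - log_lo))
    return None

# Hypothetical measurements (arbitrary concentration units).
ic50 = estimate_ic50([1.0, 100.0], [25.0, 75.0])
```

In practice a fitted dose-response model would usually replace the interpolation step; the sketch only shows where the monitored concentrations enter the calculation.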

[4957] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC₅₀. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al (1995) Analytical Chemistry 67:2142-2144.

[4958] Another aspect of the invention pertains to methods of modulating 23479, 48120, or 46689 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with a 23479, 48120, or 46689 or agent that modulates one or more of the activities of 23479, 48120, or 46689 protein activity associated with the cell. An agent that modulates 23479, 48120, or 46689 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of a 23479, 48120, or 46689 protein (e.g., a 23479, 48120, or 46689 substrate or receptor), a 23479, 48120, or 46689 antibody, a 23479, 48120, or 46689 agonist or antagonist, a peptidomimetic of a 23479, 48120, or 46689 agonist or antagonist, or other small molecule.

[4959] In one embodiment, the agent stimulates one or more 23479, 48120, or 46689 activities. Examples of such stimulatory agents include active 23479, 48120, or 46689 protein and a nucleic acid molecule encoding 23479, 48120, or 46689. In another embodiment, the agent inhibits one or more 23479, 48120, or 46689 activities. Examples of such inhibitory agents include antisense 23479, 48120, or 46689 nucleic acid molecules, anti-23479, 48120, or 46689 antibodies, and 23479, 48120, or 46689 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of a 23479, 48120, or 46689 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., up regulates or down regulates) 23479, 48120, or 46689 expression or activity. In another embodiment, the method involves administering a 23479, 48120, or 46689 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 23479, 48120, or 46689 expression or activity.

[4960] Stimulation of 23479, 48120, or 46689 activity is desirable in situations in which 23479, 48120, or 46689 is abnormally downregulated and/or in which increased 23479, 48120, or 46689 activity is likely to have a beneficial effect. For example, stimulation of 23479, 48120, or 46689 activity is desirable in situations in which a 23479, 48120, or 46689 is downregulated and/or in which increased 23479, 48120, or 46689 activity is likely to have a beneficial effect. Likewise, inhibition of 23479, 48120, or 46689 activity is desirable in situations in which 23479, 48120, or 46689 is abnormally upregulated and/or in which decreased 23479, 48120, or 46689 activity is likely to have a beneficial effect.

[4961] 23479, 48120, and 46689 Pharmacogenomics

[4962] The 23479, 48120, or 46689 molecules of the present invention, as well as agents, or modulators which have a stimulatory or inhibitory effect on 23479, 48120, or 46689 activity (e.g., 23479, 48120, or 46689 gene expression) as identified by a screening assay described herein can be administered to individuals to treat (prophylactically or therapeutically) 23479, 48120, or 46689-associated disorders (e.g., cellular proliferative or differentiative disorders, e.g., in the lung, brain, ovary, or breast, neural disorders, hematopoietic disorders, cardiovascular disorders, or liver disorders) associated with aberrant or unwanted 23479, 48120, or 46689 activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a 23479, 48120, or 46689 molecule or 23479, 48120, or 46689 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a 23479, 48120, or 46689 molecule or 23479, 48120, or 46689 modulator.

[4963] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23:983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43:254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as single factors altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase deficiency (G6PD) is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[4964] One pharmacogenomics approach to identifying genes that predict drug response, known as “a genome-wide association”, relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high resolution map can be generated from a combination of some ten million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
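
The grouping of individuals into genetic categories by SNP pattern can be sketched in a few lines. This is a minimal illustration only, assuming each individual's genotype is available as a mapping from marker name to allele. The marker names, patient labels, and the function `group_by_snp_pattern` are hypothetical.

```python
from collections import defaultdict

def group_by_snp_pattern(genotypes, marker_ids):
    """Group individuals whose alleles agree at every listed marker."""
    groups = defaultdict(list)
    for person, alleles in genotypes.items():
        # The tuple of alleles at the chosen markers is the category key.
        pattern = tuple(alleles[marker] for marker in marker_ids)
        groups[pattern].append(person)
    return dict(groups)

# Hypothetical bi-allelic genotypes; not data from this document.
genotypes = {
    "patient_1": {"snp_a": "A", "snp_b": "C"},
    "patient_2": {"snp_a": "A", "snp_b": "C"},
    "patient_3": {"snp_a": "G", "snp_b": "C"},
}
categories = group_by_snp_pattern(genotypes, ["snp_a", "snp_b"])
```

Each resulting category could then be compared against an observed drug response, which is the association step the paragraph describes.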

[4965] Alternatively, a method termed the “candidate gene approach” can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., a 23479, 48120, or 46689 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.

[4966] Alternatively, a method termed “gene expression profiling” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., a 23479, 48120, or 46689 molecule or 23479, 48120, or 46689 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[4967] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 23479, 48120, or 46689 molecule or 23479, 48120, or 46689 modulator, such as a modulator identified by one of the exemplary screening assays described herein.

[4968] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 23479, 48120, or 46689 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 23479, 48120, or 46689 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells, e.g., human cells, will become sensitive to treatment with an agent that the unmodified target cells were resistant to.

[4969] Monitoring the influence of agents (e.g., drugs) on the expression or activity of a 23479, 48120, or 46689 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 23479, 48120, or 46689 gene expression, protein levels, or upregulate 23479, 48120, or 46689 activity, can be monitored in clinical trials of subjects exhibiting decreased 23479, 48120, or 46689 gene expression, protein levels, or downregulated 23479, 48120, or 46689 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 23479, 48120, or 46689 gene expression, protein levels, or downregulate 23479, 48120, or 46689 activity, can be monitored in clinical trials of subjects exhibiting increased 23479, 48120, or 46689 gene expression, protein levels, or upregulated 23479, 48120, or 46689 activity. In such clinical trials, the expression or activity of a 23479, 48120, or 46689 gene, and preferably, other genes that have been implicated in, for example, a 23479, 48120, or 46689-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[4970] 23479, 48120, or 46689 Informatics

[4971] The sequence of a 23479, 48120, or 46689 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a 23479, 48120, or 46689. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 23479, 48120, or 46689 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequence, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[4972] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[4973] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[4974] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples for annotation to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites and splice junctions. Non-limiting examples for annotations to amino acid sequence include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
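
The two-table layout described above can be sketched as follows. The sketch uses Python's built-in `sqlite3` module purely as a stand-in for a relational database such as Sybase or Oracle. The table names, column names, identifier, and sequence are hypothetical placeholders, not sequences of the invention.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sequence (
    seq_id   TEXT PRIMARY KEY,   -- identifier field
    residues TEXT NOT NULL       -- nucleic acid or amino acid sequence
);
CREATE TABLE annotation (
    seq_id     TEXT NOT NULL REFERENCES sequence(seq_id),
    descriptor TEXT NOT NULL,    -- annotation text
    start_pos  INTEGER NOT NULL, -- initial position (1-based)
    end_pos    INTEGER NOT NULL  -- ultimate position (1-based, inclusive)
);
""")

# Placeholder rows for illustration only.
conn.execute("INSERT INTO sequence VALUES (?, ?)",
             ("example_seq", "ATGGCCAAGTGA"))
conn.executemany("INSERT INTO annotation VALUES (?, ?, ?, ?)", [
    ("example_seq", "start codon", 1, 3),
    ("example_seq", "placeholder ORF", 1, 12),
])

def annotated_regions(connection, seq_id):
    """Return (descriptor, subsequence) pairs for one stored sequence."""
    rows = connection.execute(
        "SELECT a.descriptor, "
        "       substr(s.residues, a.start_pos, a.end_pos - a.start_pos + 1) "
        "FROM annotation a JOIN sequence s ON s.seq_id = a.seq_id "
        "WHERE a.seq_id = ? ORDER BY a.start_pos, a.end_pos",
        (seq_id,),
    )
    return rows.fetchall()

regions = annotated_regions(conn, "example_seq")
```

Keeping annotations in their own table lets one sequence carry any number of annotated regions without changing the sequence row.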

[4975] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.
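
A search for a target motif in stored sequences can be illustrated with a simple pattern match. This sketch is not a BLAST search; it assumes the motif can be written as a regular expression and that the stored sequences fit in memory. The function name, the sequences, and the motif `G.S.G` are hypothetical examples.

```python
import re

def find_motif(sequences, motif_regex):
    """Return {identifier: [1-based start positions]} for matching sequences."""
    # A lookahead is used so that overlapping occurrences are also reported.
    pattern = re.compile(f"(?=({motif_regex}))")
    hits = {}
    for seq_id, residues in sequences.items():
        positions = [match.start() + 1 for match in pattern.finditer(residues)]
        if positions:
            hits[seq_id] = positions
    return hits

# Placeholder amino acid sequences for illustration only.
stored = {"seq_1": "MKGHSAGLLGASMG", "seq_2": "MKLLVVAA"}
hits = find_motif(stored, "G.S.G")
```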

[4976] Thus, in one aspect, the invention features a method of analyzing 23479, 48120, or 46689, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing a 23479, 48120, or 46689 nucleic acid or amino acid sequence; and comparing the 23479, 48120, or 46689 sequence with a second sequence, e.g., one or, more preferably, a plurality of sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 23479, 48120, or 46689. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[4977] The method can include evaluating the sequence identity between a 23479, 48120, or 46689 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.
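
Evaluating sequence identity can be sketched for the simplest case. The example assumes the two sequences have already been aligned to equal length with `-` marking gaps, and it defines identity as identical columns divided by alignment length. Other definitions are in use, so both the definition and the function name `percent_identity` are assumptions made for illustration.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    # A column counts as identical only when both residues match and are not gaps.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)

# Placeholder aligned sequences for illustration only.
identity = percent_identity("ACGT-A", "ACCTGA")
```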

[4978] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.

[4979] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[4980] Thus, the invention features a method of making a computer readable record of a sequence of a 23479, 48120, or 46689 sequence which includes recording the sequence on a computer readable matrix. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′end of the translated region.
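A computer readable record of the kind described above could, for example, be organized as follows. This is a sketch only: the class name `SequenceRecord`, its field names, the JSON serialization, and the toy sequence are assumptions made for illustration and do not describe any particular recording format.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Optional


@dataclass
class SequenceRecord:
    identifier: str
    sequence: str
    orf: Optional[tuple] = None                   # (start, end), 1-based inclusive
    domains: list = field(default_factory=list)   # e.g. [{"name": ..., "start": ..., "end": ...}]
    transcription_start: Optional[int] = None
    transcription_terminator: Optional[int] = None
    protein_sequence: Optional[str] = None        # full length or mature form
    translated_region_5prime_end: Optional[int] = None

    def to_json(self):
        """Serialize the record to a JSON string suitable for storage."""
        return json.dumps(asdict(self), sort_keys=True)


# Toy record with a hypothetical nine-nucleotide sequence.
record = SequenceRecord(identifier="example", sequence="ATGAAATAG", orf=(1, 9))
print(record.to_json())
```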

[4981] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing a 23479, 48120, or 46689 sequence, or record, in machine-readable form; comparing a second sequence to the 23479, 48120, or 46689 sequence; thereby analyzing a sequence. Comparison can include comparing the two sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 23479, 48120, or 46689 sequence includes a sequence being compared. In a preferred embodiment the 23479, 48120, or 46689 or second sequence is stored on a first computer, e.g., at a first site and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 23479, 48120, or 46689 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′end of the translated region.

[4982] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has a 23479, 48120, or 46689-associated disease or disorder or a pre-disposition to a 23479, 48120, or 46689-associated disease or disorder, wherein the method comprises the steps of determining 23479, 48120, or 46689 sequence information associated with the subject and based on the 23479, 48120, or 46689 sequence information, determining whether the subject has a 23479, 48120, or 46689-associated disease or disorder or a pre-disposition to a 23479, 48120, or 46689-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[4983] The invention further provides in an electronic system and/or in a network, a method for determining whether a subject has a 23479, 48120, or 46689-associated disease or disorder or a pre-disposition to a disease associated with 23479, 48120, or 46689, wherein the method comprises the steps of determining 23479, 48120, or 46689 sequence information associated with the subject, and based on the 23479, 48120, or 46689 sequence information, determining whether the subject has a 23479, 48120, or 46689-associated disease or disorder or a pre-disposition to a 23479, 48120, or 46689-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, comparing the 23479, 48120, or 46689 sequence of the subject to the 23479, 48120, or 46689 sequences in the database to thereby determine whether the subject has a 23479, 48120, or 46689-associated disease or disorder, or a pre-disposition for such.

[4984] The present invention also provides in a network, a method for determining whether a subject has a 23479, 48120, or 46689-associated disease or disorder or a pre-disposition to a 23479, 48120, or 46689-associated disease or disorder, said method comprising the steps of receiving 23479, 48120, or 46689 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 23479, 48120, or 46689 and/or corresponding to a 23479, 48120, or 46689-associated disease or disorder (e.g., a cellular proliferative or differentiative disorder, e.g., in the lung, brain, ovary, or breast, a neural disorder, a hematopoietic disorder, a cardiovascular disorder, or a liver disorder), and based on one or more of the phenotypic information, the 23479, 48120, or 46689 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has a 23479, 48120, or 46689-associated disease or disorder or a pre-disposition to a 23479, 48120, or 46689-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[4985] The present invention also provides a method for determining whether a subject has a 23479, 48120, or 46689-associated disease or disorder or a pre-disposition to a 23479, 48120, or 46689-associated disease or disorder, said method comprising the steps of receiving information related to 23479, 48120, or 46689 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 23479, 48120, or 46689 and/or related to a 23479, 48120, or 46689-associated disease or disorder, and based on one or more of the phenotypic information, the 23479, 48120, or 46689 information, and the acquired information, determining whether the subject has a 23479, 48120, or 46689-associated disease or disorder or a pre-disposition to a 23479, 48120, or 46689-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[4986] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

Background of the 80091 Invention

[4987] Living cells are capable of modulating the levels of proteins that they express. A variety of different mechanisms exist through which protein levels can be modulated. The ubiquitin pathway is one example of a post-translational mechanism used to regulate protein levels. Ubiquitin is a highly conserved polypeptide (8.6 kDa) expressed in all eukaryotic cells that marks proteins for degradation. Ubiquitin is attached as a single molecule or as a conjugated form to lysine residue(s) of proteins via formation of an isopeptide bond at the C-terminal glycine residue. Most ubiquitinated proteins are subsequently targeted to the 26S proteasome, a multicatalytic protease, which cleaves the marked protein into peptide fragments.

[4988] Only the protein conjugated to ubiquitin is degraded via the proteasome; ubiquitin itself is recycled by ubiquitin carboxy-terminal hydrolases (UCH; sometimes abbreviated UCTH), which cleave the bond between ubiquitin and the protein targeted for degradation. These enzymes constitute a family of thiol proteases, and homologues have been found in, for example, yeast (Miller et al., (1989) Bio/Technology 7: 698-704; Tobias and Varshavsky (1991) J. Biol. Chem. 266: 12021-12028; Baker et al., (1992) J. Biol. Chem. 267: 23364-23375), bovine (Papa and Hochstrasser (1993) Nature 366: 313-319), avian (Woo et al., (1995) J. Biol. Chem. 270: 18766-18773), Drosophila (Zhang et al., (1993) Dev. Biol. 17: 214) and human (Wilkinson et al., (1989) Science 246:670) cells.

[4989] Ubiquitination has been implicated in regulating numerous cellular processes including, for example, proliferation, differentiation, apoptosis (programmed cell death), transcription, signal-transduction, cell-cycle progression, receptor-mediated endocytosis, organelle biogenesis and others. The presence of abnormal amounts of ubiquitinated proteins in neuropathological conditions such as Alzheimer's and Pick's disease indicates that ubiquitination plays a role in various physiological disorders. Oncogenes (e.g., v-jun and v-fos) are often found to be resistant to ubiquitination in comparison to their normal cell counterparts, suggesting that a failure to degrade oncogene protein products accounts for some of their cell transformation capability. Combined with the observation that not all ubiquitinated proteins are degraded by the proteasome, these findings indicate that the processes of ubiquitination and de-ubiquitination of particular substrates have important functional roles apart from recycling ubiquitin.

[4990] There are two distinct families of UCH. The second class consists of large proteins (800 to 2000 residues) and these proteins only share two domains of similarity (UCH-1 and UCH-2). The UCH-1 domain contains a conserved cysteine that is probably implicated in the catalytic mechanism. The UCH-2 domain contains two conserved histidine residues, one of which is also probably implicated in the catalytic mechanism. The conserved signature patterns of UCH-1 and UCH-2 are respectively as follows: (1) G-[LIVMFY]-x(1,3)-[AGC]-[NASM]-x-C-[FYW]-[LIVMFC]-[NST]-[SACV]-x-[LIVMS]-Q, wherein C is the putative active site residue; and (2) Y-x-L-x-[SAG]-[LIVMFT]-x(2)-H-x-G-x(4,5)-G-H-Y (SEQ ID NO:98), wherein Hs are two putative active site residues.

[4991] Summary of the 80091 Invention

[4992] The present invention is based, in part, on the discovery of a novel ubiquitin carboxy-terminal hydrolase family member, referred to herein as “80091”. The nucleotide sequence of a cDNA encoding 80091 is shown in SEQ ID NO:94, and the amino acid sequence of an 80091 polypeptide is shown in SEQ ID NO:95. In addition, the nucleotide sequences of the coding region are depicted in SEQ ID NO:94.

[4993] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes an 80091 protein or polypeptide, e.g., a biologically active portion of the 80091 protein. In a preferred embodiment the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO:95. In other embodiments, the invention provides isolated 80091 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO:94, a full complement of SEQ ID NO:94, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO:94, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:94, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 80091 protein or an active fragment thereof.

[4994] In a related aspect, the invention further provides nucleic acid constructs that include an 80091 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included, are vectors and host cells containing the 80091 nucleic acid molecules of the invention e.g., vectors and host cells suitable for producing 80091 nucleic acid molecules and polypeptides.

[4995] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 80091-encoding nucleic acids.

[4996] In still another related aspect, isolated nucleic acid molecules that are antisense to an 80091 encoding nucleic acid molecule are provided.

[4997] In another aspect, the invention features 80091 polypeptides, and biologically active or antigenic fragments thereof that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 80091-mediated or -related disorders. In another embodiment, the invention provides 80091 polypeptides having an 80091 activity. Preferred polypeptides are 80091 proteins including at least one ubiquitin carboxyl-terminal hydrolase-1 domain and one ubiquitin carboxyl-terminal hydrolase-2 domain, and, preferably, having an 80091 activity, e.g., an 80091 activity as described herein.

[4998] In other embodiments, the invention provides 80091 polypeptides, e.g., an 80091 polypeptide having the amino acid sequence shown in SEQ ID NO:95 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO:95 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:94, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 80091 protein or an active fragment thereof.

[4999] In a related aspect, the invention further provides nucleic acid constructs which include an 80091 nucleic acid molecule described herein.

[5000] In a related aspect, the invention provides 80091 polypeptides or fragments operatively linked to non-80091 polypeptides to form fusion proteins.

[5001] In another aspect, the invention features antibodies and antigen-binding fragments thereof, that react with, or more preferably specifically bind, 80091 polypeptides or fragments thereof, e.g., a ubiquitin carboxyl-terminal hydrolase-1 domain or a ubiquitin carboxyl-terminal hydrolase-2 domain.

[5002] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 80091 polypeptides or nucleic acids.

[5003] In still another aspect, the invention provides a process for modulating 80091 polypeptide or nucleic acid expression or activity, e.g., using the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 80091 polypeptides or nucleic acids, such as conditions involving aberrant or deficient cellular proliferation or differentiation of an 80091-expressing cell, e.g., a hematopoietic cell (e.g., an erythroid cell (e.g., an erythrocyte or an erythroblast)).

[5004] The invention also provides assays for determining the activity of or the presence or absence of 80091 polypeptides or nucleic acid molecules in a biological sample, including for disease diagnosis.

[5005] In yet another aspect, the invention provides methods for inhibiting the proliferation, or inducing the killing, of an 80091-expressing cell, e.g., a hyper-proliferative 80091-expressing cell. The method includes contacting the cell with a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 80091 polypeptide or nucleic acid. In a preferred embodiment, the contacting step is effected in vitro or ex vivo.

[5006] In a preferred embodiment, the contacting step is effected in vivo in a subject, e.g., as part of a therapeutic or prophylactic protocol. Preferably, the subject is a human, e.g., a patient with a hematopoietic disorder such as an erythroid-associated disorder. For example, the subject can be a patient with an anemia, e.g., a drug-induced anemia (e.g., a chemotherapy-induced anemia), hemolytic anemia, aberrant erythropoiesis, secondary anemia in non-hematologic disorders, anemia of chronic disease such as chronic renal failure; endocrine deficiency disease; and/or erythrocytosis (e.g., polycythemia). Preferably, the erythroid-associated disorder is a drug-induced anemia (e.g., a chemotherapy-induced anemia). Alternatively, the subject can be a cancer patient, e.g., a patient with leukemic cancer, e.g., an erythroid leukemia. In other embodiments, the subject is a non-human animal, e.g., an experimental animal.

[5007] In a preferred embodiment, the method further includes contacting the erythroid cell with a protein, e.g., a hormone. The protein can be a member of the following non-limiting group: G-CSF, GM-CSF, stem cell factor, interleukin-3 (IL-3), IL-4, Flt-3 ligand, thrombopoietin, and erythropoietin. More preferably, the protein is erythropoietin. The protein contacting step can occur before, at the same time, or after the agent is contacted. The protein contacting step can be effected in vitro or ex vivo. For example, the cell, e.g., the erythroid cell can be obtained from a subject, e.g., a patient, and contacted with the protein ex vivo. The treated cell can be re-introduced into the subject. Alternatively, the protein contacting step can occur in vivo. The contacting step(s) can be repeated.

[5008] In a preferred embodiment, the agent increases the number of hematopoietic cells, e.g., erythroid cells, by e.g., increasing the proliferation, survival, and/or stimulating the differentiation, of hematopoietic (e.g., erythroid) progenitor cells, in the subject. Such agents can be used to treat an anemia, e.g., a drug- (e.g., chemotherapy-) induced anemia, hemolytic anemia, aberrant erythropoiesis, secondary anemia in non-hematologic disorders, anemia of chronic disease such as chronic renal failure; endocrine deficiency disease; and/or erythrocytosis (e.g., polycythemia).

[5009] In a preferred embodiment, the cell, e.g., the 80091-expressing cell, is an erythroid cell, a human umbilical vein endothelial cell (HUVEC), or a brain cell.

[5010] In a preferred embodiment, the agent increases the number of erythroid cells, by e.g., increasing the proliferation, survival, and/or stimulating the differentiation, of granulocytic and monocytic progenitor cells, e.g., CFU-GM, CFU-G (colony forming unit—granulocyte), myeloblast, promyelocyte, myelocyte, a metamyelocyte, or a band cell. Such compounds can be used to treat or prevent neutropenia and granulocytopenia, e.g., conditions caused by cytotoxic chemotherapy, AIDS, congenital and cyclic neutropenia, myelodysplastic syndromes, or aplastic anemia.

[5011] In a preferred embodiment, the compound is an inhibitor of an 80091 polypeptide. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion). In another preferred embodiment, the compound is an inhibitor of an 80091 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[5012] In a preferred embodiment, the compound is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[5013] In another aspect, the invention features methods for treating or preventing a disorder characterized by aberrant cellular proliferation or differentiation of an 80091-expressing cell, in a subject. Preferably, the method includes administering to the subject (e.g., a mammal, e.g., a human) an effective amount of a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 80091 polypeptide or nucleic acid. In a preferred embodiment, the disorder is a cancerous or pre-cancerous condition.

[5014] In a further aspect, the invention provides methods for evaluating the efficacy of a treatment of a disorder. The method includes: treating a subject, e.g., a patient or an animal, with a protocol under evaluation (e.g., treating a subject with one or more of: chemotherapy, radiation, and/or a compound identified using the methods described herein); and evaluating the expression of an 80091 nucleic acid or polypeptide before and after treatment. A change, e.g., a decrease or increase, in the level of an 80091 nucleic acid (e.g., mRNA) or polypeptide after treatment, relative to the level of expression before treatment, is indicative of the efficacy of the treatment of the disorder. The level of 80091 nucleic acid or polypeptide expression can be detected by any method described herein.

[5015] In a preferred embodiment, the disorder is a hematopoietic disorder, e.g., an erythroid-associated disorder. Examples of erythroid-associated disorders include an anemia, e.g., a drug- (e.g., chemotherapy-) induced anemia, a hemolytic anemia, aberrant erythropoiesis, secondary anemia in non-hematologic disorders, anemias of chronic disease such as chronic renal failure; endocrine deficiency diseases; and/or erythrocytosis (e.g., polycythemia).

[5016] In a preferred embodiment, the disorder is a cancer, e.g., leukemic cancer, e.g., an erythroid leukemia, or a carcinoma, e.g., a renal carcinoma.

[5017] In a preferred embodiment, the subject is a human.

[5018] In a preferred embodiment, the subject is an experimental animal, e.g., an animal model for a hematopoietic- (e.g., an erythroid-) associated disorder.

[5019] In a preferred embodiment, the method can further include treating the subject with a protein, e.g., a cytokine or a hormone. Exemplary proteins include, but are not limited to, G-CSF, GM-CSF, stem cell factor, interleukin-3 (IL-3), IL-4, Flt-3 ligand, thrombopoietin, and erythropoietin. Preferably, the protein is erythropoietin.

[5020] In a preferred embodiment, the evaluating step includes obtaining a sample (e.g., a tissue sample, e.g., a biopsy, or a fluid sample) from the subject, before and after treatment, and comparing the level of expression of an 80091 nucleic acid (e.g., mRNA) or polypeptide before and after treatment.

[5021] In another aspect, the invention provides methods for evaluating the efficacy of a therapeutic or prophylactic agent (e.g., an anti-neoplastic agent). The method includes: contacting a sample with an agent (e.g., a compound identified using the methods described herein, a cytotoxic agent) and, evaluating the expression of 80091 nucleic acid or polypeptide in the sample before and after the contacting step. A change, e.g., a decrease or increase, in the level of 80091 nucleic acid (e.g., mRNA) or polypeptide in the sample obtained after the contacting step, relative to the level of expression in the sample before the contacting step, is indicative of the efficacy of the agent. The level of 80091 nucleic acid or polypeptide expression can be detected by any method described herein. In a preferred embodiment, the sample includes cells obtained from a cancerous tissue or an erythroid cell tissue.

[5022] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in an 80091 polypeptide or nucleic acid molecule, including for disease diagnosis.

[5023] In another aspect, the invention features a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes an 80091 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to an 80091 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 80091 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.

[5024] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

Detailed Description of 80091

[5025] The human 80091 sequence (see SEQ ID NO:94, as recited in Example 53), which is approximately 3954 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 3954 nucleotides, including the termination codon. The coding sequence encodes a 1317 amino acid protein (see SEQ ID NO:95, as recited in Example 53).
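The stated lengths of the coding sequence and the encoded protein are mutually consistent, as the following short check illustrates: each amino acid is encoded by three nucleotides, and the termination codon adds three more. The function name `coding_length` is hypothetical and is used only for this illustration.

```python
def coding_length(n_residues, include_stop=True):
    """Number of nucleotides needed to encode `n_residues` amino acids,
    optionally counting one termination codon."""
    codons = n_residues + (1 if include_stop else 0)
    return 3 * codons


# 1317 amino acids plus one termination codon.
print(coding_length(1317))
```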

[5026] Human 80091 contains the following regions or other structural features:

[5027] one predicted ubiquitin carboxyl-terminal hydrolase-1 (UCH-1) domain (PFAM Accession PF00442) located at about amino acid 447 to about 478 of SEQ ID NO:95, which includes one predicted ubiquitin carboxyl-terminal hydrolases family 2 signature 1 from about amino acids 449 to 463 of SEQ ID NO:95;

[5028] one predicted ubiquitin carboxyl-terminal hydrolase-2 (UCH-2) domain (PFAM Accession PF00443) located at about amino acid 1219 to about 1279 of SEQ ID NO:95, which includes one predicted ubiquitin carboxyl-terminal hydrolases family 2 signature 2 from about amino acids 1223 to 1240 of SEQ ID NO:95;

[5029] ten predicted N-glycosylation sites (PS00001) located at about amino acids 168 to 171, 177 to 180, 209 to 212, 431 to 434, 459 to 462, 485 to 488, 588 to 591, 767 to 770, 1203 to 1206, and 1255 to 1258 of SEQ ID NO:95;

[5030] twenty-two predicted Protein Kinase C phosphorylation sites (PS00005) located at about amino acids 23 to 25, 56 to 58, 360 to 362, 369 to 371, 432 to 434, 442 to 444, 477 to 479, 511 to 513, 613 to 615, 764 to 766, 806 to 808, 826 to 828, 869 to 871, 938 to 940, 979 to 981, 1008 to 1010, 1085 to 1087, 1092 to 1094, 1102 to 1104, 1111 to 1113, 1135 to 1137, and 1258 to 1260 of SEQ ID NO:95;

[5031] twenty-four predicted Casein Kinase II phosphorylation sites (PS00006) located at about amino acids 10 to 13, 48 to 51, 83 to 86, 169 to 172, 180 to 183, 301 to 304, 360 to 363, 391 to 394, 421 to 424, 433 to 436, 723 to 726, 732 to 735, 741 to 744, 779 to 782, 923 to 926, 956 to 959, 1063 to 1066, 1135 to 1138, 1167 to 1170, 1173 to 1176, 1205 to 1208, 1209 to 1212, 1258 to 1261, and 1300 to 1303 of SEQ ID NO:95; two predicted Tyrosine kinase phosphorylation sites (PS00007) located at about amino acids 556 to 562, and 652 to 659 of SEQ ID NO:95;

[5032] twenty-one predicted N-myristylation sites (PS00008) located at about amino acids 41 to 46, 78 to 83, 102 to 107, 141 to 146, 161 to 166, 199 to 204, 220 to 225, 235 to 240, 288 to 293, 510 to 515, 598 to 603, 650 to 655, 754 to 759, 763 to 768, 771 to 776, 877 to 882, 1088 to 1093, 1193 to 1198, 1199 to 1204, 1233 to 1238, and 1281 to 1286 of SEQ ID NO:95;

[5033] one predicted amidation site (PS00009) located at about amino acids 1292 to 1295 of SEQ ID NO:95; and

[5034] one predicted Prenyl group binding site (PS00266) located at about amino acids 1314 to 1317 of SEQ ID NO:95.
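The coordinates listed above can be held in a simple data structure and checked for internal consistency, e.g., that every feature lies within the 1317 amino acid protein and that each signature lies within its domain. The sketch below covers only a subset of the listed features; the dictionary layout and the function names are assumptions made for illustration.

```python
PROTEIN_LENGTH = 1317  # amino acids in SEQ ID NO:95, as stated in the text

# (start, end) pairs, 1-based inclusive, copied from the feature list.
FEATURES = {
    "UCH-1 domain": (447, 478),
    "UCH-1 signature": (449, 463),
    "UCH-2 domain": (1219, 1279),
    "UCH-2 signature": (1223, 1240),
    "amidation site": (1292, 1295),
    "prenyl group binding site": (1314, 1317),
}


def span_length(span):
    start, end = span
    return end - start + 1


def within(inner, outer):
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def check_features(features, protein_length):
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []
    for name, (start, end) in features.items():
        if not (1 <= start <= end <= protein_length):
            problems.append(f"{name}: {start}-{end} outside 1-{protein_length}")
    for prefix in ("UCH-1", "UCH-2"):
        if not within(features[f"{prefix} signature"], features[f"{prefix} domain"]):
            problems.append(f"{prefix} signature is not inside the {prefix} domain")
    return problems


print(check_features(FEATURES, PROTEIN_LENGTH))
```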

[5035] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[5036] A plasmid containing the nucleotide sequence encoding human 80091 (clone “Fbh80091 FL”) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

[5037] The 80091 protein contains a significant number of structural characteristics in common with members of the ubiquitin carboxy-terminal hydrolase (“UCH”) family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics.

[5038] The UCH family consists of large proteins (about 800 to 2000 residues) that share two conserved regions, a UCH-1 domain and a UCH-2 domain, each of which is thought to participate in the catalytic mechanism. The conserved signature patterns of UCH-1 and UCH-2 are respectively as follows: (1) G-[LIVMFY]-x(1,3)-[AGC]-[NASM]-x-C-[FYW]-[LIVMFC]-[NST]-[SACV]-x-[LIVMS]-Q; and (2) Y-x-L-x-[SAG]-[LIVMFT]-x(2)-H-x-G-x(4,5)-G-H-Y (SEQ ID NO:98). An 80091 protein typically contains one or more sequences that conform to each of the signature patterns. For example, an 80091 protein can contain the sequence LSNLGNTCFMNSSIQ (SEQ ID NO:99), e.g., located at amino acids 449 to 463 of SEQ ID NO:95, which corresponds to the UCH-1 signature pattern. An 80091 protein can also include the sequence YNLYAISCHSGILGGGHY (SEQ ID NO:100), e.g., located at amino acids 1223 to 1240 of SEQ ID NO:95, which corresponds to the UCH-2 signature pattern.
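The correspondence between a sequence and a signature pattern can be checked mechanically by translating the pattern into a regular expression. The sketch below handles only the pattern elements that appear in the two signatures above (single residues, bracketed alternatives, and x with an optional repeat count); the function name `prosite_to_regex` is hypothetical. The quoted UCH-2 sequence satisfies pattern (2) in full. The quoted 15-residue UCH-1 sequence satisfies pattern (1) from its [LIVMFY] position onward; the leading G of the pattern would fall one residue upstream of the quoted sequence, which the text does not recite, so that case is checked here against the pattern without its first element.

```python
import re

UCH1 = "G-[LIVMFY]-x(1,3)-[AGC]-[NASM]-x-C-[FYW]-[LIVMFC]-[NST]-[SACV]-x-[LIVMS]-Q"
UCH2 = "Y-x-L-x-[SAG]-[LIVMFT]-x(2)-H-x-G-x(4,5)-G-H-Y"

ELEMENT = re.compile(r"(\[[A-Z]+\]|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a hyphen-separated signature pattern into a Python regex."""
    parts = []
    for element in pattern.split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, low, high = match.groups()
        token = "." if residue == "x" else residue  # x means any residue
        if low is not None:
            token += "{" + low + ("," + high if high else "") + "}"
        parts.append(token)
    return "".join(parts)


uch2_regex = prosite_to_regex(UCH2)
# Pattern (1) without its leading G, for comparison with the quoted 15-mer.
uch1_tail_regex = prosite_to_regex(UCH1.split("-", 1)[1])

print(bool(re.fullmatch(uch2_regex, "YNLYAISCHSGILGGGHY")))
print(bool(re.fullmatch(uch1_tail_regex, "LSNLGNTCFMNSSIQ")))
```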

[5039] An 80091 polypeptide can include at least one, and preferably two ubiquitin carboxy-terminal hydrolase domains (UCH domains) or regions homologous to a UCH domain. As used herein, the term “ubiquitin carboxyl-terminal hydrolase domain” refers to an amino acid sequence that participates in the removal of one or more ubiquitin molecules from a protein that has one or more molecules of ubiquitin attached to it. The definition also includes cleavage of conjugated forms of ubiquitin (e.g., in a head to tail orientation linked via a peptide bond) whether or not the ubiquitin conjugate is attached to a protein. For example, a ubiquitin-ubiquitin conjugate (dimer) could be cleaved into monomers, and a tri-ubiquitin conjugate could be cleaved into three monomers, or into a dimer and a single monomer. In either of these particular examples, the monomer or dimer could remain attached to or be cleaved from the ubiquitinated protein. An 80091 polypeptide can include at least one and preferably two UCH domains, referred to individually herein as “UCH-1 domain” and “UCH-2 domain.”

[5040] As used herein, the term “UCH-1 domain” includes an amino acid sequence of about 10 to 100 amino acid residues in length and having a bit score for the alignment of the sequence to the UCH-1 domain (HMM) of at least 25. Preferably, a UCH-1 domain includes at least about 20 to 50 amino acids, more preferably about 25 to 40 amino acid residues, or about 30 to 35 amino acids and has a bit score for the alignment of the sequence to the UCH-1 domain (HMM) of at least 35, 40, 55 or greater. The UCH-1 domain (HMM) has been assigned the PFAM Accession Number PF00442 (http://genome.wustl.edu/Pfam/.html). In one embodiment, a UCH-1 domain includes the amino acid sequence LSNLGNTCFMNSSIQ (SEQ ID NO:99), wherein C is the putative active site residue. An alignment of the UCH-1 domain (amino acids 447 to 478 of SEQ ID NO:95) of human 80091 with a consensus amino acid sequence (SEQ ID NO:96) derived from a hidden Markov model is depicted in FIG. 41A.

[5041] In a preferred embodiment, an 80091 polypeptide or protein has a “UCH-1 domain” or a region which includes at least about 20 to 50, more preferably about 25 to 40, or 30 to 35 amino acid residues and has at least about 70%, 80%, 90%, 95%, 99%, or 100% homology with a “UCH-1 domain,” e.g., a UCH-1 domain of human 80091, e.g., residues 447 to 478 of SEQ ID NO:95.

[5042] An 80091 polypeptide can include a “UCH-2 domain” or regions homologous with a “UCH-2 domain.”

[5043] As used herein, the term “UCH-2 domain” includes an amino acid sequence of about 10 to 150 amino acid residues in length and having a bit score for the alignment of the sequence to the UCH-2 domain (HMM) of at least 50. Preferably, a UCH-2 domain includes at least about 40 to 120 amino acids, more preferably about 50 to 100 amino acid residues, or about 60 to 70 amino acids and has a bit score for the alignment of the sequence to the UCH-2 domain (HMM) of at least 60, 70, 80, 90 or greater. The UCH-2 domain (HMM) has been assigned the PFAM Accession Number PF00443 (http://genome.wustl.edu/Pfam/.html). In one embodiment, a UCH-2 domain includes the amino acid sequence YNLYAISCHSGILGGGHY (SEQ ID NO:100), wherein the histidine residues are two putative active site residues. An alignment of the UCH-2 domain (amino acids 1219 to 1279 of SEQ ID NO:95) of human 80091 with a consensus amino acid sequence (SEQ ID NO:97) derived from a hidden Markov model is depicted in FIG. 41B.

[5044] In a preferred embodiment, an 80091 polypeptide or protein has a “UCH-2 domain” or a region which includes at least about 40 to 120, more preferably about 50 to 100, or 60 to 70 amino acid residues and has at least about 70%, 80%, 90%, 95%, 99%, or 100% homology with a “UCH-2 domain,” e.g., a UCH-2 domain of human 80091, e.g., residues 1219 to 1279 of SEQ ID NO:95.

[5045] To identify the presence of a “UCH (ubiquitin carboxyl-terminal hydrolase)” domain, e.g., a UCH-1 or a UCH-2 domain, in an 80091 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against a database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183: 146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84: 4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2: 305-314, the contents of which are incorporated herein by reference. A search was performed against the HMM database resulting in the identification of a “ubiquitin carboxyl-terminal hydrolase” domain in the amino acid sequence of human 80091 at about residues 447 to 478 of SEQ ID NO:95 (UCH-1) and 1219 to 1279 of SEQ ID NO:95 (UCH-2) (see FIGS. 41A and 41B).
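By way of illustration, the bit-score criteria given in the domain definitions above (at least 25 for a UCH-1 domain, at least 50 for a UCH-2 domain) can be applied to a list of domain hits as sketched below. The DomainHit record and the helper names are assumptions introduced for this sketch and do not reflect the output format of any particular search program. The coordinates are those reported above for human 80091. The bit scores are placeholder values, since no scores are stated in the text.

```python
from dataclasses import dataclass

# Minimum bit scores taken from the UCH-1 and UCH-2 domain definitions above.
MIN_BIT_SCORE = {"UCH-1": 25.0, "UCH-2": 50.0}


@dataclass(frozen=True)
class DomainHit:
    """One HMM hit on a query sequence (hypothetical record layout)."""
    model: str        # "UCH-1" or "UCH-2"
    start: int        # first residue of the hit, 1-based
    end: int          # last residue of the hit, inclusive
    bit_score: float  # alignment bit score reported by the search


def accepted_hits(hits, thresholds=MIN_BIT_SCORE):
    """Keep only hits whose bit score meets the minimum for their model."""
    return [
        hit for hit in hits
        if hit.model in thresholds and hit.bit_score >= thresholds[hit.model]
    ]


def has_both_uch_domains(hits):
    """True when at least one UCH-1 hit and one UCH-2 hit pass their thresholds."""
    models = {hit.model for hit in accepted_hits(hits)}
    return {"UCH-1", "UCH-2"} <= models


# Coordinates from the text; bit scores are placeholders, not measured values.
example_hits = [
    DomainHit("UCH-1", 447, 478, 40.0),
    DomainHit("UCH-2", 1219, 1279, 75.0),
    DomainHit("UCH-2", 100, 160, 12.0),  # hypothetical weak hit, rejected
]
```

With these placeholder scores, the two hits at the reported coordinates are retained and the weak hit is discarded. A stricter preferred embodiment could be modeled by passing a different thresholds mapping.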

[5046] An 80091 family member can include at least one UCH-1 domain and at least one UCH-2 domain.

[5047] An 80091 polypeptide can further include at least one, two, three, four, five, six, seven, eight, nine, preferably ten N-glycosylation sites; at least one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty, twenty-one, preferably twenty-two protein kinase C phosphorylation sites; at least one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty, twenty-one, twenty-two, twenty-three, preferably twenty-four casein kinase II phosphorylation sites; at least one, preferably two tyrosine kinase phosphorylation sites; at least one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty, preferably twenty-one N-myristylation sites; and at least one amidation site.

[5048] Members of the ubiquitin carboxyl-terminal hydrolase family of proteins are characterized by UCH domains, described above, and can be tested for de-ubiquitination activity using assays known in the art. De-ubiquitination assays useful for detecting a ubiquitin carboxyl-terminal hydrolase activity can be found, for example, in Zhu et al. (1997) Journal of Biological Chemistry 272: 51-57, Mitch et al. (1999) American Journal of Physiology 276: C1132-C1138, Liu et al. (1999) Molecular and Cell Biology 19: 3029-3038, and in various reviews, for example, Ciechanover et al. (1994) The FASEB Journal 8: 182-192, Ciechanover (1994) Biol. Chem. Hoppe-Seyler 375: 565-581, Hershko et al. (1998) Annual Review of Biochemistry 67: 425-479, Swartz (1999) Annual Review of Medicine 50: 57-74, Ciechanover (1998) EMBO Journal 17: 7151-7160, and D'Andrea et al. (1998) Critical Reviews in Biochemistry and Molecular Biology 33: 337-352. These assays include, but are not limited to, the disappearance of substrate, including a decrease in the amount of polyubiquitin or ubiquitinated substrate protein or protein remnant, appearance of intermediate and end products, such as appearance of free ubiquitin monomers, general protein turnover, specific protein turnover, ubiquitin binding, binding to ubiquitinated substrate protein, subunit interaction, interaction with ATP, interaction with cellular components such as trans-acting regulatory factors, stabilization of specific proteins, and the like. As the 80091 polypeptides of the invention may modulate 80091-mediated activities, they may be useful as, or for developing, novel diagnostic and therapeutic agents for 80091-mediated or related disorders, as described below.

[5049] As used herein, an “80091 activity”, “biological activity of 80091” or “functional activity of 80091”, refers to an activity exerted by an 80091 protein, polypeptide or nucleic acid molecule. For example, an 80091 activity can be an activity exerted by 80091 in a physiological milieu on, e.g., an 80091-responsive cell or on an 80091 substrate, e.g., a protein substrate. An 80091 activity can be determined in vivo or in vitro. In one embodiment, an 80091 activity is a direct activity, such as an association with an 80091 target molecule. A “target molecule” or “binding partner” is a molecule with which an 80091 protein binds or interacts in nature. In an exemplary embodiment, 80091 is an enzyme that mediates a de-ubiquitination reaction.

[5050] An 80091 activity can also be an indirect activity, e.g., a cellular signaling activity mediated by interaction of the 80091 protein with an 80091 receptor. The features of the 80091 molecules of the present invention can provide similar biological activities as ubiquitin carboxy-terminal hydrolase family members. For example, the 80091 proteins of the present invention can have one or more of the following activities: (1) modulation of de-ubiquitination of a substrate, e.g., a ubiquitinated protein targeted for degradation; (2) participation in the processing of poly-ubiquitin precursors; (3) modulation of cellular proliferation and/or differentiation; (4) modulation (e.g., stimulation) of cell differentiation, e.g., differentiation of hematopoietic cells (e.g., differentiation of blood cells (e.g., erythroid progenitor cells, such as CD34+ erythroid progenitors)); (5) modulation of hematopoiesis, e.g., erythropoiesis; (6) modulation of cell proliferation, e.g., proliferation of hematopoietic cells (e.g., erythroid progenitor cells); (7) modulation (increase or decrease) of apoptosis, e.g., apoptosis of a cancer cell, e.g., a leukemic cell, (e.g., an erythroleukemia cell); (8) modulation of transcription and/or cell-cycle progression; (9) modulation of signal transduction; (10) modulation of antigen processing; (11) modulation of cell-cell adhesion; (12) modulation of receptor-mediated endocytosis; (13) modulation of organelle biogenesis and development; (14) participation in neural development and/or maintenance; (15) participation in neuropathological conditions; (16) participation in oncogenesis; (17) modulation of immune function; (18) modulation of metabolism; or (19) regulation of gamete function.

[5051] Based upon the above-described sequence similarities and the detected expression patterns of 80091 described in Table 28 of Example 53 (e.g., erythroid cells, neural tissues, and HUVEC), the 80091 molecules of the present invention are predicted to have similar biological activities as ubiquitin carboxyl-terminal hydrolase family members. Ubiquitin carboxyl-terminal hydrolase domains regulate the de-ubiquitination of a substrate, e.g., a protein targeted for degradation. Thus, 80091 molecules can act as novel diagnostic targets and therapeutic agents for controlling, e.g., ubiquitination-related disorders. 80091 molecules of the invention may be useful, for example, in inducing the de-ubiquitination of ubiquitinated proteins. These proteins can therefore modulate protein degradation and the recycling of ubiquitin, as well as participate in cell signaling pathways in which ubiquitination or de-ubiquitination of a protein can alter or modify the activity of the protein. Thus, 80091 molecules may act as novel therapeutic agents for controlling disorders associated with excessive or insufficient ubiquitination (e.g., protein degradation), and as diagnostic markers useful for indicating the presence or predisposition towards developing such disorders, or monitoring the progression or regression of a disorder.

[5052] The 80091 molecules can act as novel diagnostic targets and therapeutic agents for controlling disorders associated with abnormal de-ubiquitination activity and disorders associated with abnormal protein degradation. Additional examples of disorders that can be treated and/or diagnosed with the molecules of the invention include hematopoietic disorders such as erythroid cell-associated disorders, cellular proliferative and/or differentiative disorders, neurological or brain disorders, metabolic disorders, angiogenic disorders, and endothelial cell disorders.

[5053] As used herein, the term “erythroid cell-associated disorders” includes disorders involving aberrant (increased or deficient) erythroblast proliferation, e.g., an erythroleukemia, and aberrant (increased or deficient) erythroblast differentiation, e.g., an anemia. Erythrocyte-associated disorders include anemias such as, for example, drug- (chemotherapy-) induced anemias, hemolytic anemias due to hereditary cell membrane abnormalities, such as hereditary spherocytosis, hereditary elliptocytosis, and hereditary pyropoikilocytosis; hemolytic anemias due to acquired cell membrane defects, such as paroxysmal nocturnal hemoglobinuria and spur cell anemia; hemolytic anemias caused by antibody reactions, for example to the RBC antigens, or antigens of the ABO system, Lewis system, Ii system, Rh system, Kidd system, Duffy system, and Kell system; methemoglobinemia; a failure of erythropoiesis, for example, as a result of aplastic anemia, pure red cell aplasia, myelodysplastic syndromes, sideroblastic anemias, and congenital dyserythropoietic anemia; secondary anemia in non-hematologic disorders, for example, as a result of chemotherapy, alcoholism, or liver disease; anemia of chronic disease, such as chronic renal failure; and endocrine deficiency diseases.

[5054] Agents that modulate 80091 polypeptide or nucleic acid activity or expression can be used to treat anemias, in particular, drug-induced anemias or anemias associated with cancer chemotherapy, chronic renal failure, malignancies, adult and juvenile rheumatoid arthritis, disorders of hemoglobin synthesis, prematurity, and zidovudine treatment of HIV infection. A subject receiving the treatment can be additionally treated with a second agent, e.g., erythropoietin, to further ameliorate the condition.

[5055] As used herein, the term “erythropoietin” or “EPO” refers to a glycoprotein produced in the kidney, which is the principal hormone responsible for stimulating red blood cell production (erythrogenesis). EPO stimulates the division and differentiation of committed erythroid progenitors in the bone marrow. Normal plasma erythropoietin levels range from 0.01 to 0.03 Units/mL, and can increase up to 100 to 1,000-fold during hypoxia or anemia. Graber and Krantz, Ann. Rev. Med. 29: 51 (1978); Eschbach and Adamson, Kidney Intl. 28:1 (1985). Recombinant human erythropoietin (rHuEpo or epoetin alfa) is commercially available as EPOGEN® (epoetin alfa, recombinant human erythropoietin) (Amgen Inc., Thousand Oaks, Calif.) and as PROCRIT® (epoetin alfa, recombinant human erythropoietin) (Ortho Biotech Inc., Raritan, N.J.).

[5056] Another example of an erythroid cell-associated disorder is erythrocytosis. Erythrocytosis, a disorder of red blood cell overproduction caused by excessive and/or ectopic erythropoietin production, can be caused by cancers, e.g., a renal cell cancer, a hepatocarcinoma, and a central nervous system cancer. Diseases associated with erythrocytosis include polycythemias, e.g., polycythemia vera, secondary polycythemia, and relative polycythemia.

[5057] Aberrant expression or activity of the 80091 molecules may be involved in neoplastic disorders. Accordingly, treatment, prevention and diagnosis of cancer or neoplastic disorders related to hematopoietic cells and, in particular, cells of the erythroid lineage are also included in the present invention. Such neoplastic disorders are exemplified by erythroid leukemias, or leukemias of erythroid precursor cells, e.g., poorly differentiated acute leukemias such as erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit Rev. in Oncol./Hematol. 11: 267-97). In particular, AML can include the uncontrolled proliferation of CD34+ cells such as AML subtypes M1 and M2, myeloblastic leukemias with and without maturation, and AML subtype M6, erythroleukemia (Di Guglielmo's disease). Additional neoplastic disorders include a myelodysplastic syndrome or preleukemic disorder, e.g., oligoblastic leukemia, smoldering leukemia. Additional cancers of the erythroid lineage include erythroblastosis, and other relevant diseases of the bone marrow.

[5058] The term “leukemia” or “leukemic cancer” is intended to have its clinical meaning, namely, a neoplastic disease in which white corpuscle maturation is arrested at a primitive stage of cell development. The disease is characterized by an increased number of leukemic blast cells in the bone marrow, and by varying degrees of failure to produce normal hematopoietic cells. The condition may be either acute or chronic. Leukemias are further typically categorized as being either lymphocytic, i.e., characterized by cells which have properties in common with normal lymphocytes, or myelocytic (or myelogenous), i.e., characterized by cells having some characteristics of normal granulocytic cells. Acute lymphocytic leukemia (“ALL”) arises in lymphoid tissue, and ordinarily first manifests its presence in bone marrow. Acute myelocytic leukemia (“AML”) arises from bone marrow hematopoietic stem cells or their progeny. The term acute myelocytic leukemia subsumes several subtypes of leukemia: myeloblastic leukemia, promyelocytic leukemia, and myelomonocytic leukemia. In addition, leukemias with erythroid or megakaryocytic properties are considered myelogenous leukemias as well.

[5059] The molecules of the invention may also modulate the activity of neoplastic, non-hematopoietic tissues in which they are expressed, e.g., kidney, lung, liver, skeletal muscle. For example, increased expression of 80091 molecules is detected in lung tumors compared to normal lung. Accordingly, the 80091 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more cellular proliferative and/or differentiative disorders. Examples of such cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, or metastatic disorders. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of lung, prostate, colon, breast, and liver origin.

[5060] Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast and liver origin.

[5061] As used herein, the terms “cancer”, “hyperproliferative” and “neoplastic” refer to cells having the capacity for autonomous growth. Examples of such cells include cells having an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative and neoplastic disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. “Pathologic hyperproliferative” cells occur in disease states characterized by malignant tumor growth. Examples of non-pathologic hyperproliferative cells include proliferation of cells associated with wound repair.

[5062] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[5063] The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

[5064] The term “sarcoma” is art recognized and refers to malignant tumors of mesenchymal derivation.

[5065] Additional examples of proliferative disorders include hematopoietic neoplastic disorders. As used herein, the term “hematopoietic neoplastic disorders” includes diseases involving hyperplastic/neoplastic cells of hematopoietic origin, e.g., arising from myeloid, lymphoid or erythroid lineages, or precursor cells thereof. Preferably, the diseases arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit Rev. in Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to, acute lymphoblastic leukemia (ALL) which includes B-lineage ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell leukemia (HLL) and Waldenstrom's macroglobulinemia (WM). Additional forms of malignant lymphomas include, but are not limited to, non-Hodgkin lymphoma and variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGF), Hodgkin's disease and Reed-Sternberg disease.

[5066] The 80091 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of neurological or brain disorders. Disorders involving the brain include, but are not limited to, disorders involving neurons, and disorders involving glia, such as astrocytes, oligodendrocytes, ependymal cells, and microglia; cerebral edema, raised intracranial pressure and herniation, and hydrocephalus; malformations and developmental diseases, such as neural tube defects, forebrain anomalies, posterior fossa anomalies, and syringomyelia and hydromyelia; perinatal brain injury; cerebrovascular diseases, such as those related to hypoxia, ischemia, and infarction, including hypotension, hypoperfusion, and low-flow states, global cerebral ischemia and focal cerebral ischemia, infarction from obstruction of local blood supply, intracranial hemorrhage, including intracerebral (intraparenchymal) hemorrhage, subarachnoid hemorrhage and ruptured berry aneurysms, and vascular malformations, hypertensive cerebrovascular disease, including lacunar infarcts, slit hemorrhages, and hypertensive encephalopathy; infections, such as acute meningitis, including acute pyogenic (bacterial) meningitis and acute aseptic (viral) meningitis, acute focal suppurative infections, including brain abscess, subdural empyema, and extradural abscess, chronic bacterial meningoencephalitis, including tuberculosis and mycobacterioses, neurosyphilis, and neuroborreliosis (Lyme disease), viral meningoencephalitis, including arthropod-borne (Arbo) viral encephalitis, Herpes simplex virus Type 1, Herpes simplex virus Type 2, Varicella-zoster virus (Herpes zoster), cytomegalovirus, poliomyelitis, rabies, and human immunodeficiency virus 1, including HIV-1 meningoencephalitis (subacute encephalitis), vacuolar myelopathy, AIDS-associated myopathy, peripheral neuropathy, and AIDS in children, progressive multifocal leukoencephalopathy, subacute sclerosing panencephalitis, fungal meningoencephalitis, other infectious diseases of the nervous system; transmissible spongiform encephalopathies (prion diseases); demyelinating diseases, including multiple sclerosis, multiple sclerosis variants, acute disseminated encephalomyelitis and acute necrotizing hemorrhagic encephalomyelitis, and other diseases with demyelination; degenerative diseases, such as degenerative diseases affecting the cerebral cortex, including Alzheimer disease and Pick disease, degenerative diseases of basal ganglia and brain stem, including Parkinsonism, idiopathic Parkinson disease (paralysis agitans), progressive supranuclear palsy, corticobasal degeneration, multiple system atrophy, including striatonigral degeneration, Shy-Drager syndrome, and olivopontocerebellar atrophy, and Huntington disease; spinocerebellar degenerations, including spinocerebellar ataxias, including Friedreich ataxia, and ataxia-telangiectasia, degenerative diseases affecting motor neurons, including amyotrophic lateral sclerosis (motor neuron disease), bulbospinal atrophy (Kennedy syndrome), and spinal muscular atrophy; inborn errors of metabolism, such as leukodystrophies, including Krabbe disease, metachromatic leukodystrophy, adrenoleukodystrophy, Pelizaeus-Merzbacher disease, and Canavan disease, mitochondrial encephalomyopathies, including Leigh disease and other mitochondrial encephalomyopathies; toxic and acquired metabolic diseases, including vitamin deficiencies such as thiamine (vitamin B₁) deficiency and vitamin B₁₂ deficiency, neurologic sequelae of metabolic disturbances, including hypoglycemia, hyperglycemia, and hepatic encephalopathy, toxic disorders, including carbon monoxide, methanol, ethanol, and radiation, including combined methotrexate and radiation-induced injury; tumors, such as gliomas, including astrocytoma, including fibrillary (diffuse) astrocytoma and glioblastoma multiforme, pilocytic astrocytoma, pleomorphic xanthoastrocytoma, and brain stem glioma, oligodendroglioma, and ependymoma and related paraventricular mass lesions, neuronal tumors, poorly differentiated neoplasms, including medulloblastoma, other parenchymal tumors, including primary brain lymphoma, germ cell tumors, and pineal parenchymal tumors, meningiomas, metastatic tumors, paraneoplastic syndromes, peripheral nerve sheath tumors, including schwannoma, neurofibroma, and malignant peripheral nerve sheath tumor (malignant schwannoma), and neurocutaneous syndromes (phakomatoses), including neurofibromatosis, including Type 1 neurofibromatosis (NF1) and Type 2 neurofibromatosis (NF2), tuberous sclerosis, and Von Hippel-Lindau disease.

[5067] Additionally, 80091 may play an important role in the regulation of metabolism. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes.

[5068] An “angiogenic disorder” refers to a disorder characterized by aberrant, unregulated, or unwanted vascularization. Angiogenic disorders include, but are not limited to, hemangiomas, Kaposi's sarcoma, von Hippel-Lindau disease; psoriasis; diabetic retinopathy; endometriosis; Graves' disease; chronic inflammatory diseases (e.g., rheumatoid arthritis); aberrant or excess angiogenesis in diseases such as Castleman's disease or fibrodysplasia ossificans progressiva; aberrant or deficient angiogenesis associated with aging, complications of healing certain wounds and complications of diseases such as diabetes and rheumatoid arthritis; or aberrant or deficient angiogenesis associated with hereditary hemorrhagic telangiectasia, autosomal dominant polycystic kidney disease, myelodysplastic syndrome or Klippel-Trenaunay-Weber syndrome.

[5069] As used herein, an “endothelial cell disorder” includes a disorder characterized by aberrant, unregulated, or unwanted endothelial cell activity, e.g., proliferation, migration, angiogenesis, or vascularization; or aberrant expression of cell surface adhesion molecules or genes associated with angiogenesis, e.g., TIE-2, FLT and FLK. Endothelial cell disorders include, but are not limited to, responses of vascular cell walls to injury, such as endothelial dysfunction and endothelial activation and intimal thickening; vascular diseases including, but not limited to, congenital anomalies, such as arteriovenous fistula, vasculitides, such as giant cell (temporal) arteritis, Takayasu arteritis, polyarteritis nodosa (classic), Kawasaki syndrome (mucocutaneous lymph node syndrome), microscopic polyangiitis (microscopic polyarteritis, hypersensitivity or leukocytoclastic angiitis), Wegener granulomatosis, thromboangiitis obliterans (Buerger disease), vasculitis associated with other disorders, and infectious arteritis; Raynaud disease; aneurysms and dissection, such as abdominal aortic aneurysms, syphilitic (luetic) aneurysms, and aortic dissection (dissecting hematoma); disorders of veins and lymphatics, such as varicose veins, thrombophlebitis and phlebothrombosis, obstruction of superior vena cava (superior vena cava syndrome), obstruction of inferior vena cava (inferior vena cava syndrome), and lymphangitis and lymphedema; tumors, including benign tumors and tumor-like conditions, such as hemangioma, lymphangioma, glomus tumor (glomangioma), vascular ectasias, and bacillary angiomatosis, and intermediate-grade (borderline low-grade malignant) tumors, such as Kaposi sarcoma, described above, and hemangioendothelioma, and malignant tumors, such as angiosarcoma and hemangiopericytoma; and pathology of therapeutic interventions in vascular disease, such as balloon angioplasty and related techniques and vascular replacement, such as coronary artery bypass graft surgery.

[5070] The 80091 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO:95 are collectively referred to as “polypeptides or proteins of the invention” or “80091 polypeptides or proteins”. Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “80091 nucleic acids.” 80091 molecules refer to 80091 nucleic acids, polypeptides, and antibodies.

[5071] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA), RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA. A DNA or RNA analog can be synthesized from nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[5072] The term “isolated nucleic acid molecule” or “purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules present in the natural source of the nucleic acid. For example, with regard to genomic DNA, the term “isolated” includes nucleic acid molecules which are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[5073] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and nonaqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified.
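For convenience, the four sets of hybridization and wash conditions recited above can be collected in a small lookup structure, as in the following sketch. The record layout and the names Stringency, CONDITIONS and get_conditions are assumptions made for illustration. The values are transcribed from the paragraph above, with the low stringency wash temperature recorded at its stated minimum of 50° C.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stringency:
    """Hybridization and wash conditions for one stringency level (illustrative layout)."""
    hybridization: str
    wash: str
    wash_temp_c: int  # wash temperature in degrees Celsius


CONDITIONS = {
    # Low stringency: washes at least at 50 C (may be raised to 55 C per the text).
    "low": Stringency("6x SSC at about 45 C", "0.2x SSC, 0.1% SDS", 50),
    "medium": Stringency("6x SSC at about 45 C", "0.2x SSC, 0.1% SDS", 60),
    "high": Stringency("6x SSC at about 45 C", "0.2x SSC, 0.1% SDS", 65),
    "very high": Stringency("0.5 M sodium phosphate, 7% SDS at 65 C", "0.2x SSC, 1% SDS", 65),
}

# The text names very high stringency as the condition to use unless otherwise specified.
DEFAULT_LEVEL = "very high"


def get_conditions(level=None):
    """Return the conditions for a level, falling back to the stated default."""
    return CONDITIONS[level or DEFAULT_LEVEL]
```

Calling get_conditions() with no argument returns the very high stringency entry, mirroring the statement that those conditions apply unless otherwise specified.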

[5074] Preferably, an isolated nucleic acid molecule of the invention that hybridizes under a stringency condition described herein to the sequence of SEQ ID NO:94, corresponds to a naturally-occurring nucleic acid molecule.

[5075] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature. For example a naturally occurring nucleic acid molecule can encode a natural protein. As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules which include at least an open reading frame encoding an 80091 protein. The gene can optionally further include non-coding sequences, e.g., regulatory sequences and introns. Preferably, a gene encodes a mammalian 80091 protein or derivative thereof.

[5076] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. “Substantially free” means that a preparation of 80091 protein is at least 10% pure. In a preferred embodiment, the preparation of 80091 protein has less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-80091 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-80091 chemicals. When the 80091 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[5077] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 80091 without abolishing or substantially altering an 80091 activity. Preferably the alteration does not substantially alter the 80091 activity, e.g., the activity is at least 20%, 40%, 60%, 70% or 80% of wild-type. An “essential” amino acid residue is a residue that, when altered from the wild-type sequence of 80091, results in abolishing an 80091 activity such that less than 20% of the wild-type activity is present. For example, conserved amino acid residues in 80091 are predicted to be particularly unamenable to alteration.

[5078] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in an 80091 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of an 80091 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 80091 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO:94, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
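The side-chain families listed above can be encoded directly as a lookup table. The following is a minimal sketch, assuming one-letter amino acid codes; the names SIDE_CHAIN_FAMILIES, shared_families and is_conservative are hypothetical and chosen for illustration only. A substitution is treated as conservative when the two different residues share at least one of the listed families.

```python
# Side-chain families exactly as enumerated in the paragraph above,
# written with one-letter amino acid codes (an illustrative encoding).
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),              # lysine, arginine, histidine
    "acidic": set("DE"),              # aspartic acid, glutamic acid
    "uncharged_polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta_branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(a, b):
    """Return the sorted names of the families that contain both residues."""
    a, b = a.upper(), b.upper()
    return sorted(
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if a in members and b in members
    )


def is_conservative(original, replacement):
    """True when two different residues share at least one side-chain family."""
    if original.upper() == replacement.upper():
        return False  # no substitution took place
    return bool(shared_families(original, replacement))
```

Under this sketch, lysine to arginine is conservative (both basic), whereas lysine to aspartic acid is not, because the two residues share none of the listed families.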

[5079] As used herein, a “biologically active portion” of an 80091 protein includes a fragment of an 80091 protein which participates in an interaction, e.g., an intramolecular or an inter-molecular interaction. An inter-molecular interaction can be a specific binding interaction or an enzymatic interaction (e.g., the interaction can be transient and a covalent bond is formed or broken). An inter-molecular interaction can be between an 80091 molecule and a non-80091 molecule or between a first 80091 molecule and a second 80091 molecule (e.g., a dimerization interaction). Biologically active portions of an 80091 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 80091 protein, e.g., the amino acid sequence shown in SEQ ID NO:95, which include fewer amino acids than the full length 80091 proteins, and exhibit at least one activity of an 80091 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 80091 protein, e.g., (1) modulation of de-ubiquitination of a substrate, e.g., a ubiquitinated protein targeted for degradation; (2) participation in the processing of poly-ubiquitin precursors; (3) modulation of cellular proliferation and/or differentiation; (4) modulation (e.g., stimulation) of cell differentiation, e.g., differentiation of hematopoietic cells (e.g., differentiation of blood cells (e.g., erythroid progenitor cells, such as CD34+ erythroid progenitors)); (5) modulation of hematopoiesis, e.g., erythropoiesis; (6) modulation of cell proliferation, e.g., proliferation of hematopoietic cells (e.g., erythroid progenitor cells); (7) modulation of apoptosis, of a cell, e.g., increase apoptosis of a cancer cell, e.g., a leukemic cell, (e.g., an erythroleukemia cell); or suppress apoptosis of a blood or erythroid cell; (8) modulation of apoptosis; (9) modulation of transcription and/or cell-cycle progression; or (10) modulation of signal transduction. A biologically active portion of an 80091 protein can be a polypeptide which is, for example, 10, 25, 50, 100, 200 or more amino acids in length. Biologically active portions of an 80091 protein can be used as targets for developing agents which modulate an 80091 mediated activity, e.g., modulation of de-ubiquitination of a substrate.

[5080] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[5081] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, 60%, and even more preferably at least 70%, 80%, 90%, 100% of the length of the reference sequence. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”).

[5082] The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.

[5083] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch ((1970) J. Mol. Biol. 48: 444-453) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used unless otherwise specified) is a BLOSUM 62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
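For illustration only, the global alignment step and the percent identity calculation can be sketched as follows. This is a simplified Needleman-Wunsch implementation and not the GAP program: it assumes a flat match/mismatch score instead of a substitution matrix, a single linear gap penalty instead of separate gap and length weights, and it defines percent identity as identical columns divided by the total number of alignment columns (gapped columns included). The function names and the default scores are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align a and b; return the two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                top.append(a[i - 1])
                bottom.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of alignment columns."""
    top, bottom = needleman_wunsch(a, b)
    if not top:
        return 0.0  # two empty sequences: nothing to compare
    identical = sum(1 for x, y in zip(top, bottom) if x == y and x != "-")
    return 100.0 * identical / len(top)
```

With these assumed scores, aligning ACGT against AGT introduces one gap, so three of the four alignment columns are identical and the sketch reports 75% identity. A different choice of denominator or gap treatment would give a different figure, which is why the parameters are fixed in the paragraph above.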

[5084] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Meyers and W. Miller ((1989) CABIOS, 4: 11-17) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[5085] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215: 403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 80091 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 80091 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25: 3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[5086] Particularly preferred 80091 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO:95. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:95 are termed substantially identical.

[5087] In the context of a nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:94 are termed substantially identical.

[5088] “Misexpression or aberrant expression,” as used herein, refers to a non-wildtype pattern of gene expression at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over- or under-expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of altered, e.g., increased or decreased, expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, translated amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[5089] “Subject,” as used herein, refers to human and non-human animals. The term “non-human animals” of the invention includes all vertebrates, e.g., mammals, such as non-human primates (particularly higher primates), sheep, dog, rodent (e.g., mouse or rat), guinea pig, goat, pig, cat, rabbits, cow, and non-mammals, such as chickens, amphibians, reptiles, etc. In a preferred embodiment, the subject is a human. In another embodiment, the subject is an experimental animal or animal suitable as a disease model.

[5090] A “purified preparation of cells,” as used herein, refers to an in vitro preparation of cells. In the case of cells from multicellular organisms (e.g., plants and animals), a purified preparation of cells is a subset of cells obtained from the organism, not the entire intact organism. In the case of unicellular microorganisms (e.g., cultured cells and microbial cells), it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[5091] Various aspects of the invention are described in further detail below.

[5092] Isolated Nucleic Acid Molecules of 80091

[5093] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes an 80091 polypeptide described herein, e.g., a full-length 80091 protein or a fragment thereof, e.g., a biologically active portion of 80091 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 80091 mRNA, and fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[5094] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO:94, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 80091 protein (i.e., “the coding region” of SEQ ID NO:94), as well as 5′ untranslated sequences. Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:94 (e.g., SEQ ID NO:94) and, e.g., no flanking sequences which normally accompany the subject sequence. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to a fragment of the protein from about amino acids 447 to 478 and 1219 to 1279 of SEQ ID NO:95.

[5095] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule which is a complement of the nucleotide sequence shown in SEQ ID NO:94, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO:94, such that it can hybridize (e.g., under a stringency condition described herein) to the nucleotide sequence shown in SEQ ID NO:94, thereby forming a stable duplex.

[5096] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence which is at least about: 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO:94, or a portion, preferably of the same length, of any of these nucleotide sequences.

[5097] 80091 Nucleic Acid Fragments

[5098] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO:94. For example, such a nucleic acid molecule can include a fragment which can be used as a probe or primer or a fragment encoding a portion of an 80091 protein, e.g., an immunogenic or biologically active portion of an 80091 protein. A fragment can comprise those nucleotides of SEQ ID NO:94, which encode a ubiquitin carboxy-terminal hydrolase domain of human 80091. The nucleotide sequence determined from the cloning of the 80091 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 80091 family members, or fragments thereof, as well as 80091 homologues, or fragments thereof, from other species.

[5099] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment which includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 50 amino acids in length, e.g., residues from 447 to 478, 1010 to 1018, and 1219 to 1279. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention.

[5100] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domains, regions, or functional sites described herein. Thus, for example, an 80091 nucleic acid fragment can include a sequence corresponding to a ubiquitin carboxy-terminal hydrolase domain.

[5101] 80091 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under a stringency condition described herein to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO:94, or of a naturally occurring allelic variant or mutant of SEQ ID NO:94. Preferably, an oligonucleotide is less than about 200, 150, 120, or 100 nucleotides in length.

[5102] In one embodiment, the probe or primer is attached to a solid support, e.g., a solid support described herein.

[5103] One exemplary kit of primers includes a forward primer that anneals to the coding strand and a reverse primer that anneals to the non-coding strand. The forward primer can anneal to the start codon, e.g., the nucleic acid sequence encoding amino acid residue 1 of SEQ ID NO:95. The reverse primer can anneal to the ultimate codon, e.g., the codon immediately before the stop codon, e.g., the codon encoding amino acid residue 1317 of SEQ ID NO:95. In a preferred embodiment, the annealing temperatures of the forward and reverse primers differ by no more than 5, 4, 3, or 2° C.
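As a rough illustration of the annealing-temperature constraint, a primer pair can be screened with the Wallace rule, which estimates the melting temperature of a short oligonucleotide as 2 °C per A or T plus 4 °C per G or C. This is a minimal sketch: the rule is only an approximation for short primers, melting temperature is used here as a stand-in for annealing temperature, and the function names and the max_diff parameter are hypothetical.

```python
def wallace_tm(primer):
    """Estimate melting temperature (deg C) with the Wallace rule:
    2 per A/T plus 4 per G/C. Intended for short oligonucleotides only."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("primer may contain only A, C, G and T")
    return 2 * at + 4 * gc


def tm_compatible(forward, reverse, max_diff=5):
    """True when the estimated temperatures differ by no more than max_diff."""
    return abs(wallace_tm(forward) - wallace_tm(reverse)) <= max_diff
```

A call such as tm_compatible(forward, reverse, max_diff=2) corresponds to the strictest of the thresholds recited above. Any primer sequences passed in would be chosen by the user; none are taken from SEQ ID NO:94.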

[5104] In a preferred embodiment the nucleic acid is a probe which is at least 10, 12, 15, 18, 20 and less than 200, more preferably less than 100, or less than 50, nucleotides in length. It should be identical, or differ by 1, or 2, or less than 5 or 10 nucleotides, from a sequence disclosed herein. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[5105] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid which encodes: a UCH-1 domain or a UCH-2 domain.

[5106] In another embodiment a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of an 80091 sequence, e.g., a domain, region, site or other sequence described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical, or differ by one base, from a sequence disclosed herein or from a naturally occurring variant. For example, primers suitable for amplifying all or a portion of any of the following regions are provided: a UCH-1 domain and a UCH-2 domain from about amino acids 447 to 478 of SEQ ID NO:95; and amino acids 1219 to 1279 of SEQ ID NO:95, respectively.

[5107] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[5108] A nucleic acid fragment encoding a “biologically active portion of an 80091 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:94, which encodes a polypeptide having an 80091 biological activity (e.g., the biological activities of the 80091 proteins are described herein), expressing the encoded portion of the 80091 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 80091 protein. For example, a nucleic acid fragment encoding a biologically active portion of 80091 includes a UCH domain, e.g., amino acid residues about 447 to 478 and 1219 to 1279 of SEQ ID NO:95. A nucleic acid fragment encoding a biologically active portion of an 80091 polypeptide may comprise a nucleotide sequence which is 300 or more nucleotides in length.

[5109] In preferred embodiments, a nucleic acid includes a nucleotide sequence which is about 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700, 2800, 2900, 3000, 3100, 3200, 3300, 3400, 3500, 3600, 3700, 3800, 3900 or more nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO:94.

[5110] In a preferred embodiment, a nucleic acid fragment differs by at least 1, 2, 3, 10, 20, or more nucleotides from the sequence of Genbank™ accession number AF155116, A83858, 176205, X63547, X63546, AJ012755, AC022596, AU131748, AU120381, AU117329, BE910371, AL043344, AV703768, AV706483, P35125, or a sequence disclosed in WO 01/70978, WO 01/75067, or WO 01/55318. Differences can include differing in length or sequence identity. For example, a nucleic acid fragment can: include one or more nucleotides from SEQ ID NO:94 located outside the region of nucleotides 1-743 or 3228-3954 of SEQ ID NO:94; not include all of the nucleotides of Genbank™ accession number AF155116, A83858, 176205, X63547, X63546, AJ012755, AC022596, AU131748, AU120381, AU117329, BE910371, AL043344, AV703768, AV706483, P35125, or a sequence disclosed in WO 01/70978, WO 01/75067, or WO 01/55318, e.g., can be one or more nucleotides shorter (at one or both ends) than the sequence of Genbank™ accession number AF155116, A83858, 176205, X63547, X63546, AJ012755, AC022596, AU131748, AU120381, AU117329, BE910371, AL043344, AV703768, AV706483, P35125, or a sequence disclosed in WO 01/70978, WO 01/75067, or WO 01/55318; or can differ by one or more nucleotides in the region of overlap.

[5111] 80091 Nucleic Acid Variants

[5112] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:94. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid which encodes the same 80091 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence which differs, by at least 1, but less than 5, 10, 20, 50, or 100 amino acid residues, from that shown in SEQ ID NO:95. If alignment is needed for this comparison the sequences should be aligned for maximum homology. The encoded protein can differ by no more than 5, 4, 3, 2, or 1 amino acid. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[5113] Nucleic acids of the invention can be chosen for having codons, which are preferred, or non-preferred, for a particular expression system. E.g., the nucleic acid can be one in which at least one codon, and preferably at least 10%, or 20% of the codons has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells.
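The proportion of altered codons can be checked with a short calculation. The sketch below assumes two in-frame coding sequences of equal length and simply counts the codon positions at which they differ; it does not verify that the changes are synonymous and it does not consult any codon-usage table. The function names are hypothetical.

```python
def split_codons(seq):
    """Split an in-frame coding sequence into consecutive three-letter codons."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def fraction_codons_altered(original, altered):
    """Fraction of codon positions at which the two sequences differ."""
    a = split_codons(original.upper())
    b = split_codons(altered.upper())
    if len(a) != len(b):
        raise ValueError("sequences must contain the same number of codons")
    if not a:
        return 0.0  # empty sequences: nothing was altered
    changed = sum(1 for x, y in zip(a, b) if x != y)
    return changed / len(a)
```

For example, changing the alanine codon GCT to GCG in the three-codon sequence ATGGCTAAA alters one codon in three, which would satisfy both the 10% and the 20% thresholds.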

[5114] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism) or can be non naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared in the encoded product).

[5115] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO:94, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 1%, 5%, 10% or 20% of the nucleotides in the subject nucleic acid. The nucleic acid can differ by no more than 5, 4, 3, 2, or 1 nucleotide. If necessary for this analysis the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.
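The counting convention described above, in which mismatches and “looped” out positions both count as differences, can be sketched as follows. The sketch assumes the two sequences have already been aligned to equal length with “-” marking a looped-out position, and that each such position counts as one difference; the function names are hypothetical.

```python
def count_differences(aligned_a, aligned_b):
    """Count alignment columns that differ. A gap ('-') opposite a residue
    counts as a difference, as does a mismatch."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for x, y in zip(aligned_a, aligned_b) if x != y)


def differs_within(aligned_a, aligned_b, upper_bound):
    """True when the sequences differ by at least one but fewer than
    upper_bound positions."""
    return 1 <= count_differences(aligned_a, aligned_b) < upper_bound
```

For the aligned pair ACGT-A and ACCTGA, the sketch counts two differences: one mismatch and one looped-out nucleotide.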

[5116] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the nucleotide sequence shown in SEQ ID NO:95 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under a stringency condition described herein, to the nucleotide sequence shown in SEQ ID NO:95 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 80091 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 80091 gene.

[5117] Preferred variants include those that are correlated with modulation of de-ubiquitination of a substrate.

[5118] Allelic variants of 80091, e.g., human 80091, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 80091 protein within a population that maintain the ability to modulate de-ubiquitination of a substrate. Functional allelic variants will typically contain only conservative substitution of one or more amino acids of SEQ ID NO:95, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally-occurring amino acid sequence variants of the 80091, e.g., human 80091, protein within a population that do not have the ability to: (1) modulate de-ubiquitination of a substrate, e.g., a ubiquitinated protein targeted for degradation; (2) participate in the processing of poly-ubiquitin precursors; (3) modulate cellular proliferation and/or differentiation; (4) modulate (e.g., stimulate) cell differentiation, e.g., differentiation of hematopoietic cells (e.g., differentiation of blood cells (e.g., erythroid progenitor cells, such as CD34+ erythroid progenitors)); (5) modulate hematopoiesis, e.g., erythropoiesis; (6) modulate cell proliferation, e.g., proliferation of hematopoietic cells (e.g., erythroid progenitor cells); (7) modulate apoptosis, of a cell, e.g., increase apoptosis of a cancer cell, e.g., a leukemic cell, (e.g., an erythroleukemia cell); or suppress apoptosis of a blood or erythroid cell; (8) modulate apoptosis; (9) modulate transcription and/or cell-cycle progression; or (10) modulate signal transduction. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO:95, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[5119] Moreover, nucleic acid molecules encoding other 80091 family members and, thus, which have a nucleotide sequence which differs from the 80091 sequences of SEQ ID NO:94 are intended to be within the scope of the invention.

[5120] Antisense Nucleic Acid Molecules, Ribozymes and Modified 80091 Nucleic Acid Molecules

[5121] In another aspect, the invention features an isolated nucleic acid molecule which is antisense to 80091. An “antisense” nucleic acid can include a nucleotide sequence which is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 80091 coding strand, or to only a portion thereof (e.g., the coding region of human 80091 corresponding to SEQ ID NO:94). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 80091 (e.g., the 5′ and 3′ untranslated regions).

[5122] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 80091 mRNA, but more preferably is an oligonucleotide which is antisense to only a portion of the coding or noncoding region of 80091 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 80091 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence of interest. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
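By way of illustration, an antisense oligonucleotide spanning the translation start site can be derived by taking the reverse complement of the targeted stretch of mRNA. The sketch below makes several assumptions: position +1 is the A of the AUG start codon, there is no position 0, the window therefore covers 10 nucleotides upstream and 10 nucleotides beginning at the start codon, and the result is written in RNA letters. The function names and the toy sequence in the usage note are hypothetical and are not taken from SEQ ID NO:94.

```python
# Watson-Crick complement for RNA letters.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq):
    """Return the reverse complement of seq in RNA letters (T is read as U)."""
    rna = seq.upper().replace("T", "U")
    return rna.translate(_COMPLEMENT)[::-1]


def antisense_around_start(mrna, start_index, upstream=10, downstream=10):
    """Antisense oligonucleotide for the window from -upstream to +downstream,
    where start_index is the 0-based index of the A of the AUG start codon."""
    if start_index < upstream or start_index + downstream > len(mrna):
        raise ValueError("window extends beyond the mRNA sequence")
    target = mrna[start_index - upstream:start_index + downstream]
    return reverse_complement_rna(target)
```

For a toy mRNA consisting of ten G residues followed by AUGCCCAAAUUUU, calling antisense_around_start with start_index=10 targets the 20 nucleotides GGGGGGGGGGAUGCCCAAAU and returns their reverse complement. A longer or shorter oligonucleotide can be obtained by changing the upstream and downstream arguments.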

[5123] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[5124] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding an 80091 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies which bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

[5125] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[5126] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for an 80091-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of an 80091 cDNA disclosed herein (i.e., SEQ ID NO:94), and a sequence having a known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haseloff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in an 80091-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 80091 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science 261:1411-1418.

[5127] 80091 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 80091 gene (e.g., the 80091 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 80091 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6:569-84; Helene, C. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) Bioessays 14:807-15. The potential sequences that can be targeted for triple helix formation can be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.
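The requirement for a sizeable stretch of purines or pyrimidines on one strand can be pictured with a short scan for homopurine runs. This is a sketch under stated assumptions: the function name, the example sequence, and the minimum tract length are hypothetical, and a homopyrimidine run on one strand is treated as a homopurine run on its complement, so only A/G runs are searched.

```python
import re


def homopurine_tracts(seq: str, min_len: int = 10):
    """Yield (start, end, tract) for runs of A/G at least min_len long.

    Coordinates are 0-based with an exclusive end. min_len is an
    illustrative threshold, not a value given in the text.
    """
    pattern = re.compile(r"[AG]{%d,}" % min_len)
    for match in pattern.finditer(seq.upper()):
        yield match.start(), match.end(), match.group()


# Hypothetical regulatory-region fragment, not an actual 80091 sequence.
promoter_fragment = "CTAGGAGGAAGAGGATCC"
print(list(homopurine_tracts(promoter_fragment, min_len=8)))
# [(2, 15, 'AGGAGGAAGAGGA')]
```

Running the same scan on the reverse complement would cover tracts on the opposite strand.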

[5128] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[5129] An 80091 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For non-limiting examples of synthetic oligonucleotides with modifications see Toulmé (2001) Nature Biotech. 19: 17 and Faria et al. (2001) Nature Biotech. 19: 40-44. Such phosphoramidite oligonucleotides can be effective antisense agents.

[5130] For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4: 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra and Perry-O'Keefe et al. (1996) Proc. Natl. Acad. Sci. 93: 14670-675.

[5131] PNAs of 80091 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 80091 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene, (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes, (e.g., S1 nucleases (Hyrup B. et al. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[5132] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86: 6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84: 648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) Bio-Techniques 6: 958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, (e.g., a peptide, hybridization triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[5133] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region which is complementary to an 80091 nucleic acid of the invention, two complementary regions one having a fluorophore and one a quencher such that the molecular beacon is useful for quantitating the presence of the 80091 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.

[5134] Isolated 80091 Polypeptides

[5135] In another aspect, the invention features an isolated 80091 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-80091 antibodies. 80091 protein can be isolated from cells or tissue sources using standard protein purification techniques. 80091 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[5136] Polypeptides of the invention include those which arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when expressed in a native cell.

[5137] In a preferred embodiment, an 80091 polypeptide has one or more of the following characteristics:

[5138] (i) it has the ability to modulate de-ubiquitination of a substrate, e.g., a ubiquitinated protein targeted for degradation;

[5139] (ii) it has the ability to participate in the processing of poly-ubiquitin precursors;

[5140] (iii) it has the ability to modulate cellular proliferation and/or differentiation;

[5141] (iv) it has the ability to modulate (e.g., stimulate) cell differentiation, e.g., differentiation of hematopoietic cells (e.g., differentiation of blood cells (e.g., erythroid progenitor cells, such as CD34+ erythroid progenitors));

[5142] (v) it has the ability to modulate hematopoiesis, e.g., erythropoiesis;

[5143] (vi) it has the ability to modulate cell proliferation, e.g., proliferation of hematopoietic cells (e.g., erythroid progenitor cells);

[5144] (vii) it has the ability to modulate apoptosis of a cell, e.g., increase apoptosis of a cancer cell, e.g., a leukemic cell (e.g., an erythroleukemia cell); or suppress apoptosis of a blood or erythroid cell;

[5145] (viii) it has a molecular weight, e.g., a deduced molecular weight, preferably ignoring any contribution of post translational modifications, amino acid composition or other physical characteristic of SEQ ID NO:95;

[5146] (ix) it has an overall sequence similarity of at least 65%, preferably at least 70%, more preferably at least 80%, 90%, or 95%, with a polypeptide of SEQ ID NO:95;

[5147] (x) it has a UCH-1 domain which is preferably about 70%, 80%, 90% or 95% identical with amino acid residues about 447 to 478 of SEQ ID NO:95; and

[5148] (xi) it has a UCH-2 domain which is preferably about 70%, 80%, 90% or 95% identical with amino acid residues about 1219 to 1279 of SEQ ID NO:95.

[5149] In a preferred embodiment the 80091 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:95. In one embodiment it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another it differs from the corresponding sequence in SEQ ID NO:95 by at least one residue but less than 20%, 15%, 10% or 5% of the residues in it differ from the corresponding sequence in SEQ ID NO:95. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non-essential residue or a conservative substitution. In a preferred embodiment the differences are not in the UCH domains. In another preferred embodiment one or more differences are in the UCH domains (amino acids 447 to 478 and 1219 to 1279 of SEQ ID NO:95).
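The counting rule in the parenthetical (mismatches and looped-out positions both count as differences) can be sketched as follows. This is only an illustration under assumptions: the two sequences are taken to be already aligned, with "-" marking looped-out positions; each gap column is counted as one difference; the percentage is computed against the number of reference residues; and the function names and toy sequences are hypothetical rather than drawn from SEQ ID NO:95.

```python
def count_differences(aligned_ref: str, aligned_var: str) -> int:
    """Count mismatched or gapped columns between two pre-aligned sequences."""
    if len(aligned_ref) != len(aligned_var):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for r, v in zip(aligned_ref, aligned_var) if r != v)


def within_limits(aligned_ref: str, aligned_var: str,
                  max_residues: int = 15, max_fraction: float = 0.20) -> dict:
    """Apply the 'at least one but less than N residues / less than X%' tests."""
    n = count_differences(aligned_ref, aligned_var)
    # Assumption: the denominator is the number of reference residues (gaps excluded).
    ref_len = sum(1 for c in aligned_ref if c != "-")
    return {
        "differences": n,
        "residue_test": 1 <= n < max_residues,
        "fraction_test": n >= 1 and n / ref_len < max_fraction,
    }


# Toy 20-residue example: one deletion (looped-out Y) and one substitution (Q to H).
reference = "MKTAYIAKQRQISFVKSHFS"
variant = "MKTA-IAKHRQISFVKSHFS"
print(within_limits(reference, variant))
# {'differences': 2, 'residue_test': True, 'fraction_test': True}
```

Tighter limits (e.g., less than 10 or 5 residues, or 15%, 10% or 5%) can be checked by passing different `max_residues` and `max_fraction` values.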

[5150] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 80091 proteins differ in amino acid sequence from SEQ ID NO:95, yet retain biological activity.

[5151] In one embodiment, the protein includes an amino acid sequence at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more homologous to SEQ ID NO:95.

[5152] An 80091 protein or fragment is provided which varies from the sequence of SEQ ID NO:95 in regions defined by amino acids about 1 to 446 and 479 to 1009 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment but which does not differ from SEQ ID NO:95 in regions defined by amino acids about 447 to 478 and 1219 to 1279. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.

[5153] In one embodiment, a biologically active portion of an 80091 protein includes a UCH-1 domain and a UCH-2 domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 80091 protein.

[5154] In a preferred embodiment, the 80091 protein has an amino acid sequence shown in SEQ ID NO:95. In other embodiments, the 80091 protein is substantially identical to SEQ ID NO:95. In yet another embodiment, the 80091 protein is substantially identical to SEQ ID NO:95 and retains the functional activity of the protein of SEQ ID NO:95, as described in detail in the subsections above.

[5155] In a preferred embodiment, a fragment differs by at least 1, 2, 3, 10, 20, or more amino acid residues encoded by a nucleotide sequence present in Genbank™ accession number AF155116, A83858, 176205, X63547, X63546, AJ012755, AC022596, AU131748, AU120381, AU117329, BE910371, AL043344, AV703768, AV706483, P35125, or a sequence disclosed in WO 01/70978, WO 01/75067, or WO 01/55318. Differences can include differing in length or sequence identity. For example, a fragment can: include one or more amino acid residues from SEQ ID NO:95 outside the region encoded by nucleotides 1-743 or 3228-3954 of SEQ ID NO:94; not include all of the amino acid residues encoded by a nucleotide sequence in Genbank™ accession number AF155116, A83858, 176205, X63547, X63546, AJ012755, AC022596, AU131748, AU120381, AU117329, BE910371, AL043344, AV703768, AV706483, P35125, or a sequence disclosed in WO 01/70978, WO 01/75067, or WO 01/55318, e.g., can be one or more amino acid residues shorter (at one or both ends) than a sequence encoded by the nucleotide sequence in Genbank™ accession number AF155116, A83858, 176205, X63547, X63546, AJ012755, AC022596, AU131748, AU120381, AU117329, BE910371, AL043344, AV703768, AV706483, P35125, or a sequence disclosed in WO 01/70978, WO 01/75067, or WO 01/55318; or can differ by one or more amino acid residues in the region of overlap.

[5156] 80091 Chimeric or Fusion Proteins

[5157] In another aspect, the invention provides 80091 chimeric or fusion proteins. As used herein, an 80091 “chimeric protein” or “fusion protein” includes an 80091 polypeptide linked to a non-80091 polypeptide. A “non-80091 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the 80091 protein, e.g., a protein which is different from the 80091 protein and which is derived from the same or a different organism. The 80091 polypeptide of the fusion protein can correspond to all or a portion (e.g., a fragment described herein) of an 80091 amino acid sequence. In a preferred embodiment, an 80091 fusion protein includes at least one (or two) biologically active portions of an 80091 protein. The non-80091 polypeptide can be fused to the N-terminus or C-terminus of the 80091 polypeptide.

[5158] The fusion protein can include a moiety which has a high affinity for a ligand. For example, the fusion protein can be a GST-80091 fusion protein in which the 80091 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 80091. Alternatively, the fusion protein can be an 80091 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 80091 can be increased through use of a heterologous signal sequence.

[5159] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[5160] The 80091 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 80091 fusion proteins can be used to affect the bioavailability of an 80091 substrate. 80091 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example, (i) aberrant modification or mutation of a gene encoding an 80091 protein; (ii) mis-regulation of the 80091 gene; and (iii) aberrant post-translational modification of an 80091 protein.

[5161] Moreover, the 80091-fusion proteins of the invention can be used as immunogens to produce anti-80091 antibodies in a subject, to purify 80091 ligands and in screening assays to identify molecules which inhibit the interaction of 80091 with an 80091 substrate.

[5162] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). An 80091-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 80091 protein.

[5163] Variants of 80091 Proteins

[5164] In another aspect, the invention also features a variant of an 80091 polypeptide, e.g., which functions as an agonist (mimetics) or as an antagonist. Variants of the 80091 proteins can be generated by mutagenesis, e.g., discrete point mutation, the insertion or deletion of sequences or the truncation of an 80091 protein. An agonist of the 80091 proteins can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of an 80091 protein. An antagonist of an 80091 protein can inhibit one or more of the activities of the naturally occurring form of the 80091 protein by, for example, competitively modulating an 80091-mediated activity of an 80091 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 80091 protein.

[5165] Variants of an 80091 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of an 80091 protein for agonist or antagonist activity.

[5166] Libraries of fragments, e.g., N-terminal, C-terminal, or internal fragments, of an 80091 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of an 80091 protein. Variants in which a cysteine residue is added or deleted or in which a residue which is glycosylated is added or deleted are particularly preferred.
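The three fragment classes named above can be enumerated in a few lines. The sketch below is a simplification: it works at the amino acid level on a hypothetical five-residue sequence, whereas an actual library would be built from fragments of the coding sequence; the function name and the minimum length are assumptions made for illustration.

```python
def fragment_library(protein: str, min_len: int) -> dict:
    """Enumerate N-terminal, C-terminal, and internal fragments of a sequence.

    N-terminal fragments keep the first residue, C-terminal fragments keep
    the last residue, and internal fragments contain neither end. The
    full-length sequence itself is not included.
    """
    n = len(protein)
    n_terminal = [protein[:k] for k in range(min_len, n)]
    c_terminal = [protein[n - k:] for k in range(min_len, n)]
    internal = [protein[i:j]
                for i in range(1, n - 1)
                for j in range(i + min_len, n)]
    return {"N-terminal": n_terminal, "C-terminal": c_terminal, "internal": internal}


# Toy sequence, not part of SEQ ID NO:95.
print(fragment_library("MKTAY", min_len=3))
# {'N-terminal': ['MKT', 'MKTA'], 'C-terminal': ['TAY', 'KTAY'], 'internal': ['KTA']}
```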

[5167] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property are known in the art. Such methods are adaptable for rapid screening of the gene libraries generated by combinatorial mutagenesis of 80091 proteins. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify 80091 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6:327-331).

[5168] Cell-based assays can be exploited to analyze a variegated 80091 library. For example, a library of expression vectors can be transfected into a cell line, e.g., a cell line which ordinarily responds to 80091 in a substrate-dependent manner. The transfected cells are then contacted with 80091 and the effect of the expression of the mutant on signaling by the 80091 substrate can be detected, e.g., by measuring de-ubiquitination of a substrate. Plasmid DNA can then be recovered from the cells which score for inhibition, or alternatively, potentiation of signaling by the 80091 substrate, and the individual clones further characterized.

[5169] In another aspect, the invention features a method of making an 80091 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 80091 polypeptide. The method includes: altering the sequence of an 80091 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[5170] In another aspect, the invention features a method of making a fragment or analog of an 80091 polypeptide having a biological activity of a naturally occurring 80091 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of an 80091 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[5171] Anti-80091 Antibodies

[5172] In another aspect, the invention provides an anti-80091 antibody, or a fragment thereof (e.g., an antigen-binding fragment thereof). The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. As used herein, the term “antibody” refers to a protein comprising at least one, and preferably two, heavy (H) chain variable regions (abbreviated herein as VH), and at least one and preferably two light (L) chain variable regions (abbreviated herein as VL). The VH and VL regions can be further subdivided into regions of hypervariability, termed “complementarity determining regions” (“CDR”), interspersed with regions that are more conserved, termed “framework regions” (FR). The extent of the framework region and CDR's has been precisely defined (see, Kabat, E. A., et al. (1991) Sequences of Proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, and Chothia, C. et al. (1987) J. Mol. Biol. 196: 901-917, which are incorporated herein by reference). Each VH and VL is composed of three CDR's and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.

[5173] The anti-80091 antibody can further include a heavy and light chain constant region, to thereby form a heavy and light immunoglobulin chain, respectively. In one embodiment, the antibody is a tetramer of two heavy immunoglobulin chains and two light immunoglobulin chains, wherein the heavy and light immunoglobulin chains are inter-connected by, e.g., disulfide bonds. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. The light chain constant region is comprised of one domain, CL. The variable region of the heavy and light chains contains a binding domain that interacts with an antigen. The constant regions of the antibodies typically mediate the binding of the antibody to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system.

[5174] As used herein, the term “immunoglobulin” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes. The recognized human immunoglobulin genes include the kappa, lambda, alpha (IgA1 and IgA2), gamma (IgG1, IgG2, IgG3, IgG4), delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Full-length immunoglobulin “light chains” (about 25 kDa or 214 amino acids) are encoded by a variable region gene at the NH2-terminus (about 110 amino acids) and a kappa or lambda constant region gene at the COOH-terminus. Full-length immunoglobulin “heavy chains” (about 50 kDa or 446 amino acids) are similarly encoded by a variable region gene (about 116 amino acids) and one of the other aforementioned constant region genes, e.g., gamma (encoding about 330 amino acids).

[5175] The term “antigen-binding fragment” of an antibody (or simply “antibody portion,” or “fragment”), as used herein, refers to one or more fragments of a full-length antibody that retain the ability to specifically bind to the antigen, e.g., 80091 polypeptide or fragment thereof. Examples of antigen-binding fragments of the anti-80091 antibody include, but are not limited to: (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab′)₂ fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 341: 544-546), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) Science 242: 423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85: 5879-5883). Such single chain antibodies are also encompassed within the term “antigen-binding fragment” of an antibody. These antibody fragments are obtained using conventional techniques known to those with skill in the art, and the fragments are screened for utility in the same manner as are intact antibodies.

[5176] The anti-80091 antibody can be a polyclonal or a monoclonal antibody. In other embodiments, the antibody can be recombinantly produced, e.g., produced by phage display or by combinatorial methods.

[5177] Phage display and combinatorial methods for generating anti-80091 antibodies are known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. International Publication No. WO 92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. International Publication No. WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9: 1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3: 81-85; Huse et al. (1989) Science 246: 1275-1281; Griffiths et al. (1993) EMBO J. 12: 725-734; Hawkins et al. (1992) J Mol Biol 226: 889-896; Clackson et al. (1991) Nature 352: 624-628; Gram et al. (1992) PNAS 89: 3576-3580; Garrard et al. (1991) Bio/Technology 9: 1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19: 4133-4137; and Barbas et al. (1991) PNAS 88: 7978-7982, the contents of all of which are incorporated by reference herein).

[5178] In one embodiment, the anti-80091 antibody is a fully human antibody (e.g., an antibody made in a mouse which has been genetically engineered to produce an antibody from a human immunoglobulin sequence), or a non-human antibody, e.g., a rodent (mouse or rat), goat, primate (e.g., monkey), or camel antibody. Preferably, the non-human antibody is a rodent (mouse or rat) antibody. Methods of producing rodent antibodies are known in the art.

[5179] Human monoclonal antibodies can be generated using transgenic mice carrying the human immunoglobulin genes rather than the mouse system. Splenocytes from these transgenic mice immunized with the antigen of interest are used to produce hybridomas that secrete human mAbs with specific affinities for epitopes from a human protein (see, e.g., Wood et al. International Application WO 91/00906; Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International Application WO 92/03918; Kay et al. International Application WO 92/03917; Lonberg, N. et al. 1994 Nature 368: 856-859; Green, L. L. et al. 1994 Nature Genet. 7: 13-21; Morrison, S. L. et al. 1984 Proc. Natl. Acad. Sci. USA 81: 6851-6855; Bruggeman et al. 1993 Year Immunol 7: 33-40; Tuaillon et al. 1993 PNAS 90: 3720-3724; Bruggeman et al. 1991 Eur J Immunol 21:1323-1326).

[5180] An anti-80091 antibody can be one in which the variable region, or a portion thereof, e.g., the CDR's, are generated in a non-human organism, e.g., a rat or mouse. Chimeric, CDR-grafted, and humanized antibodies are within the invention. Antibodies generated in a non-human organism, e.g., a rat or mouse, and then modified, e.g., in the variable framework or constant region, to decrease antigenicity in a human are within the invention.

[5181] Chimeric antibodies can be produced by recombinant DNA techniques known in the art. For example, a gene encoding the Fc constant region of a murine (or other species) monoclonal antibody molecule is digested with restriction enzymes to remove the region encoding the murine Fc, and the equivalent portion of a gene encoding a human Fc constant region is substituted (see Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., International Application WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al., European Patent Application 125,023; Better et al. (1988) Science 240: 1041-1043; Liu et al. (1987) PNAS 84: 3439-3443; Liu et al., 1987, J. Immunol. 139: 3521-3526; Sun et al. (1987) PNAS 84: 214-218; Nishimura et al., 1987, Canc. Res. 47: 999-1005; Wood et al. (1985) Nature 314: 446-449; and Shaw et al., 1988, J. Natl Cancer Inst. 80: 1553-1559).

[5182] A humanized or CDR-grafted antibody will have at least one or two but generally all three recipient CDR's (of heavy and/or light immunoglobulin chains) replaced with a donor CDR. The recipient CDR's may be replaced with at least a portion of a non-human CDR, or only some of the CDR's may be replaced with non-human CDR's. It is only necessary to replace the number of CDR's required for binding of the humanized antibody to an 80091 polypeptide or a fragment thereof. Preferably, the donor will be a rodent antibody, e.g., a rat or mouse antibody, and the recipient will be a human framework or a human consensus framework. Typically, the immunoglobulin providing the CDR's is called the “donor” and the immunoglobulin providing the framework is called the “acceptor.” In one embodiment, the donor immunoglobulin is non-human (e.g., rodent). The acceptor framework is a naturally-occurring (e.g., a human) framework or a consensus framework, or a sequence about 85% or higher, preferably 90%, 95%, 99% or higher identical thereto.

[5183] As used herein, the term “consensus sequence” refers to the sequence formed from the most frequently occurring amino acids (or nucleotides) in a family of related sequences (see, e.g., Winnaker, From Genes to Clones (Verlagsgesellschaft, Weinheim, Germany 1987)). In a family of proteins, each position in the consensus sequence is occupied by the amino acid occurring most frequently at that position in the family. If two amino acids occur equally frequently, either can be included in the consensus sequence. A “consensus framework” refers to the framework region in the consensus immunoglobulin sequence.
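The definition of a consensus sequence given above maps directly onto a per-column majority count. The following is a minimal sketch, assuming the family members are already aligned to equal length; the function name and the three toy sequences are hypothetical. Because either residue may be used when two are equally frequent, the tie-break here (alphabetically first) is an arbitrary choice made only so the output is deterministic.

```python
from collections import Counter


def consensus(aligned: list[str]) -> str:
    """Return the most frequent residue at each position of an aligned family."""
    if not aligned or len({len(s) for s in aligned}) != 1:
        raise ValueError("need a non-empty list of equal-length aligned sequences")
    result = []
    for column in zip(*aligned):
        counts = Counter(column)
        top = max(counts.values())
        # Tie: any of the tied residues is acceptable; pick the first alphabetically.
        result.append(min(res for res, n in counts.items() if n == top))
    return "".join(result)


# Toy framework-like fragments, invented for illustration.
family = ["EVQLVES", "QVQLVQS", "EVQLQES"]
print(consensus(family))  # EVQLVES
```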

[5184] An antibody can be humanized by methods known in the art. Humanized antibodies can be generated by replacing sequences of the Fv variable region which are not directly involved in antigen binding with equivalent sequences from human Fv variable regions. General methods for generating humanized antibodies are provided by Morrison, S. L., 1985, Science 229: 1202-1207, by Oi et al., 1986, BioTechniques 4: 214, and by Queen et al. U.S. Pat. No. 5,585,089, U.S. Pat. No. 5,693,761 and U.S. Pat. No. 5,693,762, the contents of all of which are hereby incorporated by reference. Those methods include isolating, manipulating, and expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable regions from at least one of a heavy or light chain. Sources of such nucleic acid are well known to those skilled in the art and, for example, may be obtained from a hybridoma producing an antibody against an 80091 polypeptide or fragment thereof. The recombinant DNA encoding the humanized antibody, or fragment thereof, can then be cloned into an appropriate expression vector.

[5185] Humanized or CDR-grafted antibodies can be produced by CDR-grafting or CDR substitution, wherein one, two, or all CDR's of an immunoglobulin chain can be replaced. See e.g., U.S. Pat. No. 5,225,539; Jones et al. 1986 Nature 321: 522-525; Verhoeyen et al. 1988 Science 239:1534; Beidler et al. 1988 J. Immunol. 141: 4053-4060; Winter U.S. Pat. No. 5,225,539, the contents of all of which are hereby expressly incorporated by reference. Winter describes a CDR-grafting method which may be used to prepare the humanized antibodies of the present invention (UK Patent Application GB 2188638A, filed on Mar. 26, 1987; Winter U.S. Pat. No. 5,225,539), the contents of which are expressly incorporated by reference.

[5186] Also within the scope of the invention are humanized antibodies in which specific amino acids have been substituted, deleted or added. Preferred humanized antibodies have amino acid substitutions in the framework region, such as to improve binding to the antigen. For example, a humanized antibody will have framework residues identical to the donor framework residue or to another amino acid other than the recipient framework residue. To generate such antibodies, a selected, small number of acceptor framework residues of the humanized immunoglobulin chain can be replaced by the corresponding donor amino acids. Preferred locations of the substitutions include amino acid residues adjacent to the CDR, or which are capable of interacting with a CDR (see e.g., U.S. Pat. No. 5,585,089). Criteria for selecting amino acids from the donor are described in U.S. Pat. No. 5,585,089, e.g., columns 12-16 of U.S. Pat. No. 5,585,089, the contents of which are hereby incorporated by reference. Other techniques for humanizing antibodies are described in Padlan et al. EP 519596 A1, published on Dec. 23, 1992.

[5187] A full-length 80091 protein, or an antigenic peptide fragment of 80091, can be used as an immunogen or can be used to identify anti-80091 antibodies made with other immunogens, e.g., cells, membrane preparations, and the like. The antigenic peptide of 80091 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO:95 and encompass an epitope of 80091. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.

[5188] Fragments of 80091 which include residues about 315 to 339, about 530 to 539, about 680 to 695, or about 1185 to 1220 can be used to make antibodies against hydrophilic regions of the 80091 protein, e.g., used as immunogens or used to characterize the specificity of an antibody. Similarly, fragments of 80091 which include residues about 188 to 205, about 540 to 550, about 817 to 825, or about 980 to 995 can be used to make an antibody against a hydrophobic region of the 80091 protein; a fragment of 80091 which includes residues about 447 to 478, or about 1219 to 1279 can be used to make an antibody against the ubiquitin carboxy-terminal hydrolase region of the 80091 protein.

[5189] Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[5190] Antibodies which bind only native 80091 protein, only denatured or otherwise non-native 80091 protein, or which bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be identified by identifying antibodies which bind to native but not denatured 80091 protein.

[5191] Preferred epitopes encompassed by the antigenic peptide are regions of 80091 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 80091 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 80091 protein and are thus likely to constitute surface residues useful for targeting antibody production.
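As a rough illustration of how hydrophilic, potentially surface-exposed stretches might be located in a sequence, the following Python sketch averages Kyte-Doolittle hydropathy values over a sliding window. This is a simpler, substituted technique and is not the Emini surface probability analysis named above; the window length, the threshold, the function names, and the toy sequence are all assumptions chosen for illustration, and the residue ranges recited herein were not derived with this code.

```python
# Kyte-Doolittle hydropathy scale (negative values are hydrophilic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(seq, window=7):
    """Mean hydropathy of every full window along the sequence."""
    return [
        sum(KD[aa] for aa in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def hydrophilic_windows(seq, window=7, threshold=-1.5):
    """1-based (start, end) residue ranges whose mean score is at or below threshold."""
    return [
        (i + 1, i + window)
        for i, score in enumerate(window_hydropathy(seq, window))
        if score <= threshold
    ]


# Hypothetical toy sequence: seven lysines followed by seven leucines.
toy = "KKKKKKKLLLLLLL"
print(hydrophilic_windows(toy))
```

Windows reported by such a scan are only candidates; whether a stretch is actually exposed or antigenic would still need to be assessed by other means.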

[5192] The anti-80091 antibody can be a single chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher, D. et al. (1999) Ann N Y Acad Sci 880:263-80; and Reiter, Y. (1996) Clin Cancer Res 2:245-52). The single chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 80091 protein.

[5193] In a preferred embodiment the antibody has effector function and/or can fix complement. In other embodiments the antibody does not recruit effector cells or fix complement.

[5194] In a preferred embodiment, the antibody has reduced or no ability to bind an Fc receptor. For example, it is an isotype or subtype, fragment or other mutant, which does not support binding to an Fc receptor, e.g., it has a mutagenized or deleted Fc receptor binding region.

[5195] In a preferred embodiment, an anti-80091 antibody alters (e.g., increases or decreases) the de-ubiquitination of an 80091 polypeptide.

[5196] The antibody can be coupled to a toxin, e.g., a polypeptide toxin, e.g., ricin or diphtheria toxin or an active fragment thereof, or to a radioactive nucleus or an imaging agent, e.g., a radioactive, enzymatic, or other imaging agent, e.g., an NMR contrast agent. Labels which produce detectable radioactive emissions or fluorescence are preferred.

[5197] An anti-80091 antibody (e.g., monoclonal antibody) can be used to isolate 80091 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-80091 antibody can be used to detect 80091 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-80091 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labelling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

[5198] The invention also includes a nucleic acid which encodes an anti-80091 antibody, e.g., an anti-80091 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g., CHO or lymphatic cells.

[5199] The invention also includes cell lines, e.g., hybridomas, which make an anti-80091 antibody, e.g., an antibody described herein, and methods of using said cells to make an anti-80091 antibody.

[5200] 80091 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[5201] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[5202] A vector can include an 80091 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 80091 proteins, mutant forms of 80091 proteins, fusion proteins, and the like).

[5203] The recombinant expression vectors of the invention can be designed for expression of 80091 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[5204] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[5205] Purified fusion proteins can be used in 80091 activity assays, (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 80091 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells which are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six weeks).

[5206] One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
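The codon-alteration strategy described above can be sketched in Python as a synonymous recoding step: each codon is translated with the standard genetic code and, where a preferred codon is listed for that amino acid, replaced by it. The `PREFERRED` table below is an illustrative placeholder covering only a few residues and is an assumption, not a measured E. coli usage table and not data from Wada et al.; the function names and the toy coding sequence are likewise hypothetical.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Illustrative placeholder (assumption): one favored codon for a few residues.
PREFERRED = {"M": "ATG", "K": "AAA", "L": "CTG", "E": "GAA", "A": "GCG", "W": "TGG"}


def translate(cds):
    """Translate a coding sequence codon by codon ('*' marks a stop)."""
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds, preferred):
    """Swap each codon for the preferred synonymous codon, if one is listed."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        amino_acid = CODON_TO_AA[codon]
        # Codons without a listed preference (including stops) are kept as-is.
        out.append(preferred.get(amino_acid, codon))
    return "".join(out)


# Hypothetical toy coding sequence, not an 80091 sequence.
original = "ATGAAGTTAGAGGCTTGGTGA"
recoded = recode(original, PREFERRED)
print(recoded)
print(translate(original) == translate(recoded))
```

Because only synonymous codons are substituted, the encoded polypeptide is unchanged; a practical design would use a complete usage table for the chosen host strain.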

[5207] The 80091 expression vector can be a yeast expression vector, a vector for expression in insect cells, e.g., a baculovirus expression vector or a vector suitable for expression in mammalian cells.

[5208] When used in mammalian cells, the expression vector's control functions can be provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[5209] In another embodiment, the promoter is an inducible promoter, e.g., a promoter regulated by a steroid hormone, by a polypeptide hormone (e.g., by means of a signal transduction pathway), or by a heterologous polypeptide (e.g., the tetracycline-inducible systems, “Tet-On” and “Tet-Off”; see, e.g., Clontech Inc., CA, Gossen and Bujard (1992) Proc. Natl. Acad. Sci. USA 89:5547, and Paillard (1989) Human Gene Therapy 9:983).

[5210] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1: 268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43: 235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8: 729-733) and immunoglobulins (Banerji et al. (1983) Cell 33: 729-740; Queen and Baltimore (1983) Cell 33: 741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86: 5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230: 912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249: 374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3: 537-546).

[5211] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus.

[5212] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., an 80091 nucleic acid molecule within a recombinant expression vector or an 80091 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[5213] A host cell can be any prokaryotic or eukaryotic cell. For example, an 80091 protein can be expressed in bacterial cells (such as E. coli), insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells (African green monkey kidney cells CV-1 origin SV40 cells; Gluzman (1981) Cell 23: 175-182)). Other suitable host cells are known to those skilled in the art.

[5214] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[5215] A host cell of the invention can be used to produce (i.e., express) an 80091 protein. Accordingly, the invention further provides methods for producing an 80091 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding an 80091 protein has been introduced) in a suitable medium such that an 80091 protein is produced. In another embodiment, the method further includes isolating an 80091 protein from the medium or the host cell.

[5216] In another aspect, the invention features a cell or purified preparation of cells which include an 80091 transgene, or which otherwise misexpress 80091. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include an 80091 transgene, e.g., a heterologous form of an 80091, e.g., a gene derived from humans (in the case of a non-human cell). The 80091 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene that mis-expresses an endogenous 80091, e.g., a gene the expression of which is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders that are related to mutated or mis-expressed 80091 alleles or for use in drug screening.

[5217] In another aspect, the invention features a human cell transformed with a nucleic acid which encodes a subject 80091 polypeptide.

[5218] Also provided are cells, preferably human cells, e.g., human hematopoietic or fibroblast cells, in which an endogenous 80091 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 80091 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 80091 gene. For example, an endogenous 80091 gene which is “transcriptionally silent,” e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques such as targeted homologous recombination can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published on May 16, 1991.

[5219] In a preferred embodiment, recombinant cells described herein can be used for replacement therapy in a subject. For example, a nucleic acid encoding an 80091 polypeptide operably linked to an inducible promoter (e.g., a steroid hormone receptor-regulated promoter) is introduced into a human or nonhuman, e.g., mammalian, e.g., porcine recombinant cell. The cell is cultivated and encapsulated in a biocompatible material, such as poly-lysine alginate, and subsequently implanted into the subject. See, e.g., Lanza (1996) Nat. Biotechnol. 14:1107; Joki et al. (2001) Nat. Biotechnol. 19:35; and U.S. Pat. No. 5,876,742. Production of 80091 polypeptide can be regulated in the subject by administering an agent (e.g., a steroid hormone) to the subject. In another preferred embodiment, the implanted recombinant cells express and secrete an antibody specific for an 80091 polypeptide. The antibody can be any antibody or any antibody derivative described herein.

[5220] 80091 Transgenic Animals

[5221] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of an 80091 protein and for identifying and/or evaluating modulators of 80091 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 80091 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[5222] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of an 80091 protein to particular cells. A transgenic founder animal can be identified based upon the presence of an 80091 transgene in its genome and/or expression of 80091 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding an 80091 protein can further be bred to other transgenic animals carrying other transgenes.

[5223] 80091 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments the nucleic acid is placed under the control of a tissue specific promoter, e.g., a milk or egg specific promoter, and recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[5224] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[5225] Uses of 80091

[5226] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: a) screening assays; b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); and c) methods of treatment (e.g., therapeutic and prophylactic).

[5227] The isolated nucleic acid molecules of the invention can be used, for example, to express an 80091 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect an 80091 mRNA (e.g., in a biological sample) or a genetic alteration in an 80091 gene, and to modulate 80091 activity, as described further below. The 80091 proteins can be used to treat disorders characterized by insufficient or excessive production of an 80091 substrate or production of 80091 inhibitors. In addition, the 80091 proteins can be used to screen for naturally occurring 80091 substrates, to screen for drugs or compounds which modulate 80091 activity, as well as to treat disorders characterized by insufficient or excessive production of 80091 protein or production of 80091 protein forms which have decreased, aberrant or unwanted activity compared to 80091 wild type protein (e.g., an erythroid-associated disorder). Moreover, the anti-80091 antibodies of the invention can be used to detect and isolate 80091 proteins, regulate the bioavailability of 80091 proteins, and modulate 80091 activity.

[5228] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 80091 polypeptide is provided. The method includes: contacting the compound with the subject 80091 polypeptide; and evaluating the ability of the compound to interact with, e.g., to bind or form a complex with, the subject 80091 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with the subject 80091 polypeptide. It can also be used to find natural or synthetic inhibitors of the subject 80091 polypeptide. Screening methods are discussed in more detail below.

[5229] 80091 Screening Assays

[5230] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to 80091 proteins, have a stimulatory or inhibitory effect on, for example, 80091 expression or 80091 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of an 80091 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 80091 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[5231] In one embodiment, the invention provides assays for screening candidate or test compounds which are substrates of an 80091 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate an activity of an 80091 protein or polypeptide or a biologically active portion thereof.

[5232] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann, R. N. et al. (1994) J. Med. Chem. 37:2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).

[5233] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. U.S.A. 90: 6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91: 11422; Zuckermann et al. (1994) J. Med. Chem. 37: 2678; Cho et al. (1993) Science 261: 1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33: 2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33: 2061; and Gallop et al. (1994) J. Med. Chem. 37:1233.

[5234] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364: 555-556), bacteria (Ladner, U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. 5,223,409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249: 386-390; Devlin (1990) Science 249: 404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87: 6378-6382; Felici (1991) J. Mol. Biol. 222: 301-310; Ladner supra.).

[5235] In one embodiment, an assay is a cell-based assay in which a cell which expresses an 80091 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 80091 activity is determined. Determining the ability of the test compound to modulate 80091 activity can be accomplished by monitoring, for example, de-ubiquitination. The cell, for example, can be of mammalian origin, e.g., human.

[5236] The ability of the test compound to modulate 80091 binding to a compound, e.g., an 80091 substrate, or to bind to 80091 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 80091 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 80091 could be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 80091 binding to an 80091 substrate in a complex. For example, compounds (e.g., 80091 substrates) can be labeled with ¹²⁵I, ³⁵S, ¹⁴C, or ³H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[5237] The ability of a compound (e.g., an 80091 substrate) to interact with 80091 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 80091 without the labeling of either the compound or the 80091 (McConnell, H. M. et al. (1992) Science 257:1906-1912). As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 80091.

[5238] In yet another embodiment, a cell-free assay is provided in which an 80091 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 80091 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 80091 proteins to be used in assays of the present invention include fragments which participate in interactions with non-80091 molecules, e.g., fragments with high surface probability scores.

[5239] Soluble and/or membrane-bound forms of isolated proteins (e.g., 80091 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, Isotridecylpoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[5240] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[5241] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternately, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
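The distance dependence of energy transfer mentioned above is commonly modeled with the Förster relation, E = 1 / (1 + (r / R0)^6), where r is the donor-acceptor separation and R0 is the separation at which transfer is 50% efficient. The text states only that efficiency is related to distance; the specific relation, the function name, and the numerical values in the following Python sketch are assumptions added for illustration.

```python
def fret_efficiency(r, r0):
    """Forster transfer efficiency for separation r and Forster radius r0 (same units)."""
    if r < 0 or r0 <= 0:
        raise ValueError("r must be non-negative and r0 must be positive")
    return 1.0 / (1.0 + (r / r0) ** 6)


# Hypothetical Forster radius of 5 nm for an unspecified donor/acceptor pair.
R0_NM = 5.0
for distance_nm in (2.5, 5.0, 10.0):
    print(distance_nm, round(fret_efficiency(distance_nm, R0_NM), 4))
```

The steep sixth-power falloff is why acceptor emission is expected to be high when the labeled molecules are bound and low when they are apart.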

[5242] In another embodiment, determining the ability of the 80091 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63: 2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5: 699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal which can be used as an indication of real-time reactions between biological molecules.

[5243] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound (which is not anchored) can be labeled, either directly or indirectly, with detectable labels discussed herein.

[5244] It may be desirable to immobilize either 80091, an anti-80091 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to an 80091 protein, or interaction of an 80091 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/80091 fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 80091 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and complex formation is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 80091 binding or activity determined using standard techniques.

[5245] Other techniques for immobilizing either an 80091 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 80091 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical).

[5246] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[5247] In one embodiment, this assay is performed utilizing antibodies reactive with 80091 protein or target molecules but which do not interfere with binding of the 80091 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 80091 protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 80091 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 80091 protein or target molecule.

[5248] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components, by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., (1993) Trends Biochem Sci 18:284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York.); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. (1999) Current Protocols in Molecular Biology, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., (1998) J Mol Recognit 11: 141-8; Hage, D. S., and Tweed, S. A. (1997) J Chromatogr B Biomed Sci Appl. 699:499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[5249] In a preferred embodiment, the assay includes contacting the 80091 protein or biologically active portion thereof with a known compound which binds 80091 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with an 80091 protein, wherein determining the ability of the test compound to interact with an 80091 protein includes determining the ability of the test compound to preferentially bind to 80091 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[5250] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 80091 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of an 80091 protein through modulation of the activity of a downstream effector of an 80091 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[5251] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), a reaction mixture containing the target gene product and the binding partner is prepared, under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.
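The comparison between the control reaction and the reaction containing the test compound can be expressed as a percent inhibition of complex formation. The following is a minimal sketch under stated assumptions: the signal values are hypothetical detector readings, the background subtraction step and the function name are illustrative, and nothing here is prescribed by the assays described above.

```python
def percent_inhibition(control_signal: float,
                       test_signal: float,
                       background: float = 0.0) -> float:
    """Percent reduction in complex signal relative to the control reaction.

    control_signal : signal from the reaction without test compound
    test_signal    : signal from the reaction with test compound
    background     : assumed signal from a blank (no complex) reaction
    """
    control = control_signal - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (1.0 - (test_signal - background) / control)


# Hypothetical readings: the same compound tested against normal and
# mutant target gene product, each with its own control reaction.
normal = percent_inhibition(control_signal=1100.0, test_signal=1050.0, background=100.0)
mutant = percent_inhibition(control_signal=1100.0, test_signal=350.0, background=100.0)
print(f"normal: {normal:.1f}% inhibition, mutant: {mutant:.1f}% inhibition")
```

In this hypothetical case the compound would appear to disrupt the mutant interaction far more than the normal one, which is the kind of selectivity the mutant-versus-normal comparison is meant to reveal.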

[5252] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[5253] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner, is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled, either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[5254] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[5255] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[5256] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in that either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[5257] In yet another aspect, the 80091 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 80091 (“80091-binding proteins” or “80091-bp”) and are involved in 80091 activity. Such 80091-bps can be activators or inhibitors of signals by the 80091 proteins or 80091 targets as, for example, downstream elements of an 80091-mediated signaling pathway.

[5258] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for an 80091 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively, the 80091 protein can be fused to the activation domain.) If the “bait” and the “prey” proteins are able to interact, in vivo, forming an 80091-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., lacZ) which is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene which encodes the protein which interacts with the 80091 protein.

[5259] In another embodiment, modulators of 80091 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 80091 mRNA or protein evaluated relative to the level of expression of 80091 mRNA or protein in the absence of the candidate compound. When expression of 80091 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 80091 mRNA or protein expression. Alternatively, when expression of 80091 mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 80091 mRNA or protein expression. The level of 80091 mRNA or protein expression can be determined by methods described herein for detecting 80091 mRNA or protein.
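The classification of a candidate compound as a stimulator or inhibitor of expression can be sketched as a comparison of replicate expression measurements made in the presence and absence of the compound. The following is a minimal illustration only. The exact permutation test, the 0.05 significance threshold, the function names, and the replicate values are all assumptions chosen for the example; the specification does not prescribe a particular statistical method.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Two-sided exact permutation p-value for a difference in group means."""
    pooled = list(treated) + list(untreated)
    n_treated = len(treated)
    observed = abs(mean(treated) - mean(untreated))
    total = 0
    extreme = 0
    # Enumerate every way of relabeling the pooled values as "treated".
    for idx in combinations(range(len(pooled)), n_treated):
        chosen = set(idx)
        group_a = [pooled[i] for i in range(len(pooled)) if i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def classify_compound(treated, untreated, alpha=0.05):
    """Label a compound from expression levels with and without it."""
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"


# Hypothetical relative expression levels (arbitrary units), four replicates each.
without_compound = [1.0, 1.1, 0.9, 1.0]
with_compound = [2.1, 2.3, 2.0, 2.2]
print(classify_compound(with_compound, without_compound))
```

With four replicates per group there are 70 possible relabelings, so completely separated groups give a p-value of 2/70, which falls below the assumed 0.05 threshold. Fewer replicates could not reach that threshold with this test.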

[5260] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of an 80091 protein can be confirmed in vivo, e.g., in an animal such as an animal model for an erythroid-associated disorder.

[5261] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., an 80091 modulating agent, an antisense 80091 nucleic acid molecule, an 80091-specific antibody, or an 80091-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[5262] 80091 Detection Assays

[5263] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome, e.g., to locate gene regions associated with genetic disease or to associate 80091 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[5264] 80091 Chromosome Mapping

[5265] The 80091 nucleotide sequences or portions thereof can be used to map the location of the 80091 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 80091 sequences with genes associated with disease.

[5266] Briefly, 80091 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 80091 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 80091 sequences will yield an amplified fragment.

[5267] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes, and a full set of mouse chromosomes, can allow easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220: 919-924).
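The logic of assigning a gene to a chromosome from a somatic cell hybrid panel can be sketched as a concordance check: the gene is assigned to the human chromosome whose presence or absence across the panel matches the pattern of positive and negative PCR results. The following is a minimal sketch; the panel composition, the hybrid names, and the PCR results are hypothetical and serve only to illustrate the reasoning.

```python
def concordant_chromosomes(panel, pcr_positive):
    """Return chromosomes whose presence matches the PCR result in every hybrid.

    panel        : dict mapping hybrid name -> set of human chromosomes retained
    pcr_positive : dict mapping hybrid name -> True if an amplified fragment was seen
    """
    all_chromosomes = set().union(*panel.values())
    return sorted(
        chrom for chrom in all_chromosomes
        if all((chrom in retained) == pcr_positive[hybrid]
               for hybrid, retained in panel.items())
    )


# Hypothetical panel: each hybrid line retains a few human chromosomes.
PANEL = {
    "hybrid_A": {"1", "7", "X"},
    "hybrid_B": {"7", "12"},
    "hybrid_C": {"1", "12"},
    "hybrid_D": {"X", "12"},
}
# Hypothetical PCR screening results with primers from the sequence of interest.
RESULTS = {"hybrid_A": True, "hybrid_B": True, "hybrid_C": False, "hybrid_D": False}

print(concordant_chromosomes(PANEL, RESULTS))
```

If more than one chromosome is returned, the panel does not resolve the assignment and additional hybrids would be needed. An empty result would suggest an inconsistent panel or a failed reaction.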

[5268] Other mapping strategies, e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87: 6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries, can be used to map 80091 to a chromosomal location.

[5269] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases, will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques ((1988) Pergamon Press, New York).

[5270] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[5271] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325: 783-787.

[5272] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 80091 gene, can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

[5273] 80091 Tissue Typing

[5274] 80091 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).
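The basis of an RFLP comparison can be illustrated with a simulated digest: a sequence difference that creates or destroys a restriction site changes the set of fragment lengths. The following is a minimal sketch. It uses the EcoRI recognition site GAATTC, which is cut after the first G, and treats the sequence as a linear single strand, which is adequate for a palindromic site. The two allele sequences are invented for illustration and are not taken from SEQ ID NO:94.

```python
def digest(seq: str, site: str = "GAATTC", cut_offset: int = 1):
    """Return fragment lengths from cutting a linear sequence at each site.

    cut_offset is the position within the site after which the strand is cut
    (1 for EcoRI, which cuts G^AATTC).
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Hypothetical alleles: allele B carries a single-base change (A -> G)
# that destroys the second EcoRI site.
ALLELE_A = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TT"
ALLELE_B = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TT"

print("allele A fragments:", digest(ALLELE_A))
print("allele B fragments:", digest(ALLELE_B))
```

The differing fragment sets correspond to the differing band patterns that would be observed after separation and probing.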

[5275] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 80091 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.
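Preparing a primer pair from the two ends of a known sequence can be sketched as follows: the forward primer is taken directly from the 5′ end, and the reverse primer is the reverse complement of the 3′ end. This is a minimal sketch only. The 20-base default length and the 15 to 25 base limits are assumptions for illustration, the example sequence is invented, and practical primer design would also consider melting temperature and specificity.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def end_primers(seq: str, length: int = 20):
    """Return (forward, reverse) primers taken from the ends of seq.

    The forward primer matches the 5' end of the given strand; the reverse
    primer is the reverse complement of its 3' end.
    """
    seq = seq.upper()
    if not 15 <= length <= 25:
        raise ValueError("primer length assumed to be 15-25 bases")
    if len(seq) < 2 * length:
        raise ValueError("sequence too short for non-overlapping primers")
    return seq[:length], reverse_complement(seq[-length:])


# Invented sequence used only to exercise the functions.
EXAMPLE = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAGTTGACCTGAAGCTGAACGGT"
forward, reverse = end_primers(EXAMPLE)
print("forward:", forward)
print("reverse:", reverse)
```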

[5276] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences of SEQ ID NO:94 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO:94, are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[5277] If a panel of reagents from 80091 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[5278] Use of Partial 80091 Sequences in Forensic Biology

[5279] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[5280] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e. another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO:94 (e.g., fragments derived from the noncoding regions of SEQ ID NO:94 having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[5281] The 80091 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 80091 probes can be used to identify tissue by species and/or by organ type.

[5282] In a similar fashion, these reagents, e.g., 80091 primers or probes can be used to screen tissue culture for contamination (i.e. screen for the presence of a mixture of different types of cells in a culture).

[5283] Predictive Medicine of 80091

[5284] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[5285] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene which encodes 80091.

[5286] Such disorders include, e.g., a disorder associated with the misexpression of the 80091 gene, or a disorder of the erythroid system.

[5287] The method includes one or more of the following:

[5288] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 80091 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[5289] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 80091 gene;

[5290] detecting, in a tissue of the subject, the misexpression of the 80091 gene, at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[5291] detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of an 80091 polypeptide.

[5292] In preferred embodiments the method includes: ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 80091 gene; an insertion of one or more nucleotides into the gene, a point mutation, e.g., a substitution of one or more nucleotides of the gene, a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[5293] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from SEQ ID NO:94, or naturally occurring mutants thereof or 5′ or 3′ flanking sequences naturally associated with the 80091 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[5294] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 80091 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 80091.

[5295] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[5296] In preferred embodiments the method includes determining the structure of an 80091 gene, an abnormal structure being indicative of risk for the disorder.

[5297] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 80091 protein or a nucleic acid which hybridizes specifically with the gene. These and other embodiments are discussed below.

[5298] Diagnostic and Prognostic Assays of 80091

[5299] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 80091 molecules and for identifying variations and mutations in the sequence of 80091 molecules.

[5300] Expression Monitoring and Profiling:

[5301] The presence, level, or absence of 80091 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 80091 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 80091 protein such that the presence of 80091 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the 80091 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 80091 genes; measuring the amount of protein encoded by the 80091 genes; or measuring the activity of the protein encoded by the 80091 genes.

[5302] The level of mRNA corresponding to the 80091 gene in a cell can be determined both by in situ and by in vitro formats.

[5303] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 80091 nucleic acid, such as the nucleic acid of SEQ ID NO:94, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 80091 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[5304] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 80091 genes.

[5305] The level of mRNA in a sample that is encoded by 80091 can be evaluated with nucleic acid amplification, e.g., by rtPCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88: 189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87: 1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86: 1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6: 1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
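The definition of amplification primers given above can be illustrated by locating a primer pair on a template and checking the stated length ranges (primers of about 10 to 30 nucleotides flanking a region of about 50 to 200 nucleotides). The following is a minimal sketch under stated assumptions: it requires exact matches, considers only the first match of each primer, and uses an invented template and invented primers rather than any sequence from this specification.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def flanked_region(template: str, forward: str, reverse: str):
    """Return the template region lying between the two primer sites, or None.

    The forward primer is matched on the given strand; the reverse primer is
    matched through its reverse complement, downstream of the forward site.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    inner_start = start + len(forward)
    end = template.find(reverse_complement(reverse), inner_start)
    if end == -1:
        return None
    return template[inner_start:end]


def within_stated_ranges(forward: str, reverse: str, region: str) -> bool:
    """Check primer lengths (10-30 nt) and flanked region length (50-200 nt)."""
    primers_ok = all(10 <= len(p) <= 30 for p in (forward, reverse))
    return primers_ok and 50 <= len(region) <= 200


# Invented template: two 20-base primer sites around a 100-base region.
FORWARD = "ATGGCCATTGTAATGGGCCG"
DOWNSTREAM_SITE = "TTGACCTGAAGCTGAACGGT"
REVERSE = reverse_complement(DOWNSTREAM_SITE)
TEMPLATE = "GG" + FORWARD + "ACGT" * 25 + DOWNSTREAM_SITE + "CC"

region = flanked_region(TEMPLATE, FORWARD, REVERSE)
print(len(region), within_stated_ranges(FORWARD, REVERSE, region))
```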

[5306] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 80091 gene being analyzed.

[5307] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 80091 mRNA, or genomic DNA, and comparing the presence of 80091 mRNA or genomic DNA in the control sample with the presence of 80091 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 80091 transcript levels.

[5308] A variety of methods can be used to determine the level of protein encoded by 80091. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody, with a sample to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)₂) can be used. The term “labeled,” with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[5309] The detection methods can be used to detect 80091 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 80091 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 80091 protein include introducing into a subject a labeled anti-80091 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-80091 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[5310] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 80091 protein, and comparing the presence of 80091 protein in the control sample with the presence of 80091 protein in the test sample.

[5311] The invention also includes kits for detecting the presence of 80091 in a biological sample. For example, the kit can include a compound or agent capable of detecting 80091 protein or mRNA in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 80091 protein or nucleic acid.

[5312] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[5313] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples which can be assayed and compared to the test sample contained. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[5314] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with misexpressed or aberrant or unwanted 80091 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as an erythroid-associated disorder or deregulated cell proliferation.

[5315] In one embodiment, a disease or disorder associated with aberrant or unwanted 80091 expression or activity is identified. A test sample is obtained from a subject and 80091 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 80091 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 80091 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[5316] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 80091 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for a disordered erythroid cell.

[5317] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 80091 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 80091 (e.g., other genes associated with an 80091-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
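
By way of illustration only, a data record of the kind described above can be sketched as a row in a relational table. The sketch below uses Python's built-in `sqlite3` module in place of the Oracle or Sybase environments named above; the table name `expression_record`, the column names, and the helper functions are assumptions made for this example and are not prescribed by the description.

```python
import sqlite3


def build_expression_db():
    """Create an in-memory database holding one table of data records.
    Table and column names are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE expression_record (
               sample_id        TEXT NOT NULL,  -- descriptor: sample identifier
               subject_id       TEXT,           -- descriptor: subject of origin
               diagnosis        TEXT,           -- descriptor: diagnosis
               treatment        TEXT,           -- descriptor: treatment
               gene             TEXT NOT NULL,  -- e.g. '80091' or another gene
               expression_level REAL NOT NULL,  -- value for level of expression
               PRIMARY KEY (sample_id, gene)
           )"""
    )
    return conn


def add_record(conn, sample_id, gene, level,
               subject_id=None, diagnosis=None, treatment=None):
    """Insert one data record (one gene in one sample)."""
    conn.execute(
        "INSERT INTO expression_record VALUES (?, ?, ?, ?, ?, ?)",
        (sample_id, subject_id, diagnosis, treatment, gene, level),
    )


def levels_for_sample(conn, sample_id):
    """Return a gene -> expression level mapping for one sample."""
    rows = conn.execute(
        "SELECT gene, expression_level FROM expression_record "
        "WHERE sample_id = ?",
        (sample_id,),
    )
    return dict(rows.fetchall())
```

Storing one row per gene per sample lets a record carry values for genes other than 80091 without changing the table layout; a wide table with one column per gene would be an equally plausible design.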

[5318] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 80091 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used to diagnose an erythroid-associated disorder in a subject wherein an increase in 80091 expression is an indication that the subject has or is disposed to having an erythroid-associated disorder. The method can be used to monitor a treatment for an erythroid-associated disorder in a subject. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[5319] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 80091 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from an uncontacted cell.

[5320] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; and b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 80091 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profiles is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
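
The distance metric described above, i.e., the length of the vector that is the difference between two profiles, together with the selection of a most similar reference profile, can be sketched as follows. This is an illustrative sketch only; the function names and the representation of a profile as a list of numeric values in a fixed gene order are assumptions made for this example.

```python
import math


def profile_distance(profile_a, profile_b):
    """Length of the difference vector between two profiles, each given as a
    sequence of expression values in the same gene order."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same number of dimensions")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


def most_similar_reference(subject_profile, reference_profiles):
    """Return (name, distance) for the reference profile closest to the
    subject profile. reference_profiles maps a name to a profile."""
    name = min(
        reference_profiles,
        key=lambda n: profile_distance(subject_profile, reference_profiles[n]),
    )
    return name, profile_distance(subject_profile, reference_profiles[name])
```

Other routine measures, such as a correlation coefficient, could be substituted for the distance without changing the selection step.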

[5321] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[5322] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile, and the reference expression profiles each include a value representing the level of 80091 expression.

[5323] 80091 Arrays and Uses Thereof

[5324] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to an 80091 molecule (e.g., an 80091 nucleic acid or an 80091 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm², and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to the addresses of the plurality can be disposed on the array.

[5325] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to an 80091 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 80091. Each address of the subset can include a capture probe that hybridizes to a different region of an 80091 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for an 80091 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 80091 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 80091 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).

[5326] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[5327] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to an 80091 polypeptide or fragment thereof. The polypeptide can be a naturally-occurring interaction partner of 80091 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-80091 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[5328] In another aspect, the invention features a method of analyzing the expression of 80091. The method includes providing an array as described above; contacting the array with a sample; and detecting binding of an 80091-molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally, the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[5329] In another embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array, particularly the expression of 80091. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 80091. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
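
As a simple illustration of identifying genes whose expression tracks that of 80091 across diverse samples, the sketch below ranks genes by Pearson correlation with a target gene. This is a deliberately simpler stand-in for the clustering methods named above (hierarchical, k-means, or Bayesian clustering), not an implementation of them; the function names, the 0.9 threshold, and the data layout (a mapping from gene name to a list of expression levels, one per sample) are assumptions made for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile is treated as uncorrelated
    return cov / (sd_x * sd_y)


def coregulated_genes(expression, target, threshold=0.9):
    """Return the sorted names of genes whose expression across samples
    correlates with the target gene at or above the threshold."""
    reference = expression[target]
    return sorted(
        gene for gene, levels in expression.items()
        if gene != target and pearson(reference, levels) >= threshold
    )
```

A correlation ranking of this kind only flags candidate co-regulated genes; grouping genes into clusters would require one of the clustering methods mentioned above.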

[5330] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 80091 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[5331] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of genes other than the target gene can be ascertained and counteracted.

[5332] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of an 80091-associated disease or disorder; and processes, such as a cellular transformation associated with an 80091-associated disease or disorder. The method can also evaluate the treatment and/or progression of an 80091-associated disease or disorder.

[5333] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 80091) that could serve as a molecular target for diagnosis or therapeutic intervention.

[5334] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon an 80091 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to an 80091 polypeptide or fragment thereof. For example, multiple variants of an 80091 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the addresses of the plurality can be disposed on the array.

[5335] The polypeptide array can be used to detect an 80091 binding compound, e.g., an antibody in a sample from a subject with specificity for an 80091 polypeptide or the presence of an 80091-binding protein or ligand.

[5336] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 80091 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[5337] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 80091 or from a cell or subject in which an 80091 mediated response has been elicited, e.g., by contact of the cell with 80091 nucleic acid or protein, or administration to the cell or subject of 80091 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 80091 (or does not express as highly as in the case of the 80091 positive plurality of capture probes) or from a cell or subject in which an 80091 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which are preferably other than an 80091 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[5338] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe; contacting the array with a first sample from a cell or subject which expresses or mis-expresses 80091 or from a cell or subject in which an 80091-mediated response has been elicited, e.g., by contact of the cell with 80091 nucleic acid or protein, or administration to the cell or subject of 80091 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a second sample from a cell or subject which does not express 80091 (or does not express as highly as in the case of the 80091 positive plurality of capture probes) or from a cell or subject in which an 80091 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used, the plurality of addresses with capture probes should be present on both arrays.

[5339] In another aspect, the invention features a method of analyzing 80091, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing an 80091 nucleic acid or amino acid sequence; and comparing the 80091 sequence with one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 80091.

[5340] Detection of 80091 Variations or Mutations

[5341] The methods of the invention can also be used to detect genetic alterations in an 80091 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by misregulation in 80091 protein activity or nucleic acid expression, such as an erythroid-associated disorder. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding an 80091-protein, or the mis-expression of the 80091 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of 1) a deletion of one or more nucleotides from an 80091 gene; 2) an addition of one or more nucleotides to an 80091 gene; 3) a substitution of one or more nucleotides of an 80091 gene, 4) a chromosomal rearrangement of an 80091 gene; 5) an alteration in the level of a messenger RNA transcript of an 80091 gene, 6) aberrant modification of an 80091 gene, such as of the methylation pattern of the genomic DNA, 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of an 80091 gene, 8) a non-wild type level of an 80091-protein, 9) allelic loss of an 80091 gene, and 10) inappropriate post-translational modification of an 80091-protein.

[5342] An alteration can be detected with a probe/primer in a polymerase chain reaction, such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 80091 gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to an 80091 gene under conditions such that hybridization and amplification of the 80091 gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[5343] In another embodiment, mutations in an 80091 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA are isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment length sizes are determined, e.g., by gel electrophoresis, and compared. Differences in fragment length sizes between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
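
The comparison of restriction fragment lengths between sample and control DNA can be illustrated in silico. The following is a simplified sketch under stated assumptions: only one strand of a linear molecule is modeled, the enzyme is represented by a recognition sequence and a cut offset within that sequence, and the function names are hypothetical. The usage example assumes the EcoRI recognition sequence GAATTC with cleavage after the first G; the short sequences used are arbitrary and are not 80091 sequences.

```python
def fragment_lengths(sequence, site, cut_offset=0):
    """Return the sorted fragment lengths obtained by cutting a linear
    sequence at every occurrence of the recognition site. cut_offset is the
    cut position relative to the start of the site (simplified model)."""
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    # Skip zero-length pieces that arise when a cut falls at a sequence end.
    return sorted(b - a for a, b in zip(bounds, bounds[1:]) if b > a)


def cleavage_patterns_differ(sample, control, site, cut_offset=0):
    """True if the sample and control give different fragment length sets,
    which in this model indicates a sequence difference affecting a site."""
    return (fragment_lengths(sample, site, cut_offset)
            != fragment_lengths(control, site, cut_offset))
```

For example, `fragment_lengths("AAAGAATTCAAAAA", "GAATTC", cut_offset=1)` yields fragments of 4 and 10 nucleotides, whereas the same sequence with a single base change inside the site remains uncut. A difference in pattern does not by itself identify the mutation, and changes that leave every recognition site intact go undetected.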

[5344] In other embodiments, genetic mutations in 80091 can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the others. A different probe is located at each address of the plurality. A probe can be complementary to a region of an 80091 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of an 80091 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 80091 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
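
The design of a linear array of sequential overlapping probes, as used in the first scanning step described above, can be sketched as a sliding window over a reference sequence. This is an illustrative sketch only; the function names, the default probe length of 25 nucleotides, and the step of one nucleotide are assumptions made for this example and are not taken from the cited references.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written with A, C, G and T."""
    return seq.translate(_COMPLEMENT)[::-1]


def tiling_probes(reference, probe_length=25, step=1):
    """Return (start, probe) pairs for overlapping windows of the reference.
    Each probe is the reverse complement of its window, so that it is
    complementary to that region of the reference strand."""
    if probe_length <= 0 or step <= 0:
        raise ValueError("probe_length and step must be positive")
    last_start = len(reference) - probe_length
    return [
        (start, reverse_complement(reference[start:start + probe_length]))
        for start in range(0, last_start + 1, step)
    ]
```

With a step smaller than the probe length, adjacent probes overlap, so each position of the reference is covered by several probes.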

[5345] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 80091 gene and detect mutations by comparing the sequence of the sample 80091 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.

[5346] Other methods for detecting mutations in the 80091 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl. Acad Sci USA 85: 4397; Saleeba et al. (1992) Methods Enzymol. 217: 286-295).

[5347] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 80091 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymidine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[5348] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 80091 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc Natl. Acad. Sci USA 86:2766; see also Cotton (1993) Mutat. Res. 285: 125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9: 73-79). Single-stranded DNA fragments of sample and control 80091 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence; the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7: 5).

[5349] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[5350] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad. Sci USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[5351] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci USA 88: 189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.

[5352] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 65%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to an 80091 nucleic acid.
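By way of illustration only, the degree of complementarity between an oligonucleotide and a target site can be computed position by position. The following is a minimal sketch, assuming ungapped sequences of equal length written 5′ to 3′ and using only the bases A, C, G, and T; the function name `percent_complementarity` and the example sequences are hypothetical and are not taken from SEQ ID NO:94.

```python
# Hypothetical helper for illustration; not part of the original disclosure.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(oligo: str, target_site: str) -> float:
    """Return the percentage of oligo positions that pair with the target site.

    Both sequences are written 5'->3' and must have the same length.
    The oligo is compared with the reverse complement of the target site,
    which is the sequence that would pair perfectly with that site.
    """
    oligo = oligo.upper()
    target_site = target_site.upper()
    if not oligo or len(oligo) != len(target_site):
        raise ValueError("sequences must be non-empty and of equal length")
    # Build the perfectly complementary oligo for this target site.
    perfect = "".join(COMPLEMENT[base] for base in reversed(target_site))
    matches = sum(1 for a, b in zip(oligo, perfect) if a == b)
    return 100.0 * matches / len(oligo)


if __name__ == "__main__":
    site = "AAGGCCTTAC"  # made-up 10-nt target site
    print(percent_complementarity("GTAAGGCCTT", site))  # fully complementary
    print(percent_complementarity("GTAAGGCCTA", site))  # one mismatch
```

An oligonucleotide scoring at or above a chosen threshold (e.g., 90%) under this simple measure would fall within the corresponding range recited above; the sketch does not model gaps, wobble pairing, or hybridization thermodynamics.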

[5353] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO:94 or the complement of SEQ ID NO:94. Different locations can be different but overlapping, or non-overlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[5354] The set can be useful, e.g., for identifying SNPs, or identifying specific alleles of 80091. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a biallelic or polymorphic locus.

[5355] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the T_m of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
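A set of four oligonucleotides differing only at the interrogation position can be enumerated mechanically. The following is a minimal sketch under stated assumptions: the function name `interrogation_set`, the 0-based position convention, and the example sequence are hypothetical and are not drawn from any 80091 sequence.

```python
# Hypothetical helper for illustration; not part of the original disclosure.
def interrogation_set(reference: str, position: int) -> dict[str, str]:
    """Return four oligos that differ only at one interrogation position.

    `position` is a 0-based index into `reference`. The result maps each
    base (A, C, G, T) to a copy of the reference carrying that base at the
    interrogation position and the reference sequence everywhere else.
    """
    reference = reference.upper()
    if not 0 <= position < len(reference):
        raise ValueError("position must index into the reference sequence")
    return {
        base: reference[:position] + base + reference[position + 1:]
        for base in "ACGT"
    }


if __name__ == "__main__":
    # Made-up 7-nt sequence with the interrogation position at index 3.
    for base, oligo in interrogation_set("GATTACA", 3).items():
        print(base, oligo)
```

Each entry of the returned mapping could then be assigned a distinguishable label or a distinct array address, consistent with the differential labeling and solid-support embodiments described above.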

[5356] In a preferred embodiment the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, an 80091 nucleic acid.

[5357] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving an 80091 gene.

[5358] Use of 80091 Molecules as Surrogate Markers

[5359] The 80091 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 80091 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 80091 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker which correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[5360] The 80091 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker which correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., an 80091 marker) transcription or expression, the amplified marker may be in a quantity which is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-80091 antibodies may be employed in an immune-based detection system for an 80091 protein marker, or 80091-specific radiolabeled probes may be used to detect an 80091 mRNA marker. Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. 
Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[5361] The 80091 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker which correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35: 1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA, or protein (e.g., 80091 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 80091 DNA may correlate with 80091 drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[5362] Pharmaceutical Compositions of 80091

[5363] The nucleic acid and polypeptides, fragments thereof, as well as anti-80091 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[5364] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[5365] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol and sorbitol, or sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

[5366] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying which yields a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[5367] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[5368] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[5369] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[5370] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[5371] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[5372] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[5373] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD₅₀ (the dose lethal to 50% of the population) and the ED₅₀ (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD₅₀/ED₅₀. Compounds which exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
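The therapeutic index defined above is a simple ratio, and it can be computed as follows. This is a minimal sketch, assuming that the LD₅₀ and ED₅₀ are expressed in the same units; the function name `therapeutic_index` and the numeric values are hypothetical and do not represent measured data.

```python
# Hypothetical helper for illustration; not part of the original disclosure.
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index, defined as LD50 / ED50.

    Both doses must be positive and expressed in the same units
    (for example, mg/kg), so the ratio is dimensionless.
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


if __name__ == "__main__":
    # Made-up values: LD50 of 200 mg/kg and ED50 of 10 mg/kg.
    print(therapeutic_index(200.0, 10.0))
```

A larger ratio indicates a wider separation between the dose producing the therapeutic effect and the dose producing lethality, which is why compounds with high therapeutic indices are preferred.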

[5374] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED₅₀ with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC₅₀ (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.
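One simple way to estimate an IC₅₀ from cell culture data is to interpolate between the two tested concentrations that bracket 50% inhibition. The following is a minimal sketch, assuming a monotonic response and interpolation on a log-concentration scale; this interpolation is an illustrative choice rather than a method prescribed herein, and the function name `estimate_ic50` and the data are hypothetical.

```python
# Hypothetical helper for illustration; not part of the original disclosure.
import math


def estimate_ic50(concentrations, inhibition_percent):
    """Estimate the concentration giving 50% inhibition.

    Uses linear interpolation on a log10 concentration scale between the
    first pair of adjacent data points whose responses bracket 50%.
    Concentrations must be positive; responses are percentages (0-100).
    """
    if len(concentrations) != len(inhibition_percent):
        raise ValueError("inputs must have the same length")
    if any(c <= 0 for c in concentrations):
        raise ValueError("concentrations must be positive")
    points = sorted(zip(concentrations, inhibition_percent))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if y_lo <= 50.0 <= y_hi and y_hi > y_lo:
            fraction = (50.0 - y_lo) / (y_hi - y_lo)
            log_ic50 = math.log10(c_lo) + fraction * (
                math.log10(c_hi) - math.log10(c_lo)
            )
            return 10 ** log_ic50
    raise ValueError("data do not bracket 50% inhibition")


if __name__ == "__main__":
    # Made-up dose-response data (arbitrary concentration units).
    print(estimate_ic50([1.0, 100.0], [25.0, 75.0]))
```

In practice a fitted dose-response model may be used in place of two-point interpolation; the sketch only shows how a half-maximal concentration follows from the measured data.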

[5375] As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
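The weight-based ranges above translate into absolute amounts by multiplying each bound by the body weight of the subject. The following is a minimal arithmetic sketch; the function name `dose_range_mg`, its default bounds (taken from the broadest range recited above, about 0.001 to 30 mg/kg), and the 70 kg example weight are assumptions made for illustration.

```python
# Hypothetical helper for illustration; not part of the original disclosure.
def dose_range_mg(body_weight_kg: float,
                  low_mg_per_kg: float = 0.001,
                  high_mg_per_kg: float = 30.0) -> tuple[float, float]:
    """Convert a mg/kg dosage range into an absolute range in mg.

    The default bounds correspond to the broadest range recited in the
    text (about 0.001 to 30 mg/kg body weight).
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if low_mg_per_kg < 0 or low_mg_per_kg > high_mg_per_kg:
        raise ValueError("bounds must satisfy 0 <= low <= high")
    return (low_mg_per_kg * body_weight_kg, high_mg_per_kg * body_weight_kg)


if __name__ == "__main__":
    # Example: the narrower 1 to 10 mg/kg range for an assumed 70 kg subject.
    print(dose_range_mg(70.0, 1.0, 10.0))
```

The same arithmetic applies to any of the narrower ranges recited above by passing the corresponding bounds.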

[5376] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J. Acquired Immune Deficiency Syndromes and Human Retrovirology 14: 193).

[5377] The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[5378] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

[5379] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, puromycin, maytansinoids, e.g., maytansinol (see U.S. Pat. No. 5,208,020), CC-1065 (see U.S. Pat. Nos. 5,475,092, 5,585,499, 5,846,545) and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, CC-1065, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine, vinblastine, taxol and maytansinoids). Radioactive ions include, but are not limited to, iodine, yttrium and praseodymium.

[5380] The conjugates of the invention can be used for modifying a given biological response; the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, Pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[5381] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[5382] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[5383] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[5384] Methods of Treatment for 80091

[5385] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 80091 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[5386] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. “Pharmacogenomics”, as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype” or “drug response genotype”). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 80091 molecules of the present invention or 80091 modulators according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

[5387] In one aspect, the invention provides a method for preventing in a subject, a disease or condition associated with an aberrant or unwanted 80091 expression or activity, by administering to the subject an 80091 or an agent which modulates 80091 expression or at least one 80091 activity. Subjects at risk for a disease which is caused or contributed to by aberrant or unwanted 80091 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 80091 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 80091 aberrance, for example, an 80091, 80091 agonist or 80091 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[5388] It is possible that some 80091 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[5389] The 80091 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of hematopoietic disorders such as erythroid cell-associated disorders, cellular proliferative and/or differentiative disorders, neurological or brain disorders, metabolic disorders, angiogenic disorders, and endothelial cell disorders, as described above, as well as one or more of disorders associated with bone metabolism, immune disorders, cardiovascular disorders, liver disorders, viral diseases, or pain disorders.

[5390] The 80091 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of disorders associated with bone metabolism, immune disorders, cardiovascular disorders, liver disorders, viral diseases, pain or metabolic disorders.

[5391] Aberrant expression and/or activity of 80091 molecules may mediate disorders associated with bone metabolism. “Bone metabolism” refers to direct or indirect effects in the formation or degeneration of bone structures, e.g., bone formation, bone resorption, etc., which may ultimately affect the concentrations in serum of calcium and phosphate. This term also includes activities mediated by the effects of 80091 molecules in bone cells, e.g., osteoclasts and osteoblasts, that may in turn result in bone formation and degeneration. For example, 80091 molecules may support different activities of bone resorbing osteoclasts such as the stimulation of differentiation of monocytes and mononuclear phagocytes into osteoclasts. Accordingly, 80091 molecules that modulate the production of bone cells can influence bone formation and degeneration, and thus may be used to treat bone disorders. Examples of such disorders include, but are not limited to, osteoporosis, osteodystrophy, osteomalacia, rickets, osteitis fibrosa cystica, renal osteodystrophy, osteosclerosis, anti-convulsant treatment, osteopenia, fibrogenesis-imperfecta ossium, secondary hyperparathyroidism, hypoparathyroidism, hyperparathyroidism, cirrhosis, obstructive jaundice, drug induced metabolism, medullary carcinoma, chronic renal disease, sarcoidosis, glucocorticoid antagonism, malabsorption syndrome, steatorrhea, tropical sprue, idiopathic hypercalcemia and milk fever.

[5392] The 80091 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of immune disorders. Examples of immune disorders or diseases include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy such as atopic allergy.

[5393] Examples of disorders involving the heart or “cardiovascular disorder” include, but are not limited to, a disease, disorder, or state involving the cardiovascular system, e.g., the heart, the blood vessels, and/or the blood. A cardiovascular disorder can be caused by an imbalance in arterial pressure, a malfunction of the heart, or an occlusion of a blood vessel, e.g., by a thrombus. Examples of such disorders include hypertension, atherosclerosis, coronary artery spasm, congestive heart failure, coronary artery disease, valvular disease, arrhythmias, and cardiomyopathies.

[5394] Disorders which may be treated or diagnosed by methods described herein include, but are not limited to, disorders associated with an accumulation in the liver of fibrous tissue, such as that resulting from an imbalance between production and degradation of the extracellular matrix accompanied by the collapse and condensation of preexisting fibers. The methods described herein can be used to diagnose or treat hepatocellular necrosis or injury induced by a wide variety of agents including processes which disturb homeostasis, such as an inflammatory process, tissue damage resulting from toxic injury or altered hepatic blood flow, and infections (e.g., bacterial, viral and parasitic). For example, the methods can be used for the early detection of hepatic injury, such as portal hypertension or hepatic fibrosis. In addition, the methods can be employed to detect liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, A1-antitrypsin deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease), disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome). 
Additionally, the methods described herein may be useful for the early detection and treatment of liver injury associated with the administration of various chemicals or drugs, such as, for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol, or which represents a hepatic manifestation of a vascular disorder such as obstruction of either the intrahepatic or extrahepatic bile flow or an alteration in hepatic circulation resulting, for example, from chronic heart failure, veno-occlusive disease, portal vein thrombosis or Budd-Chiari syndrome.

[5395] Additionally, 80091 molecules may play an important role in the etiology of certain viral diseases, including but not limited to Hepatitis B, Hepatitis C and Herpes Simplex Virus (HSV). Modulators of 80091 activity could be used to control viral diseases. The modulators can be used in the treatment and/or diagnosis of viral infected tissue or virus-associated tissue fibrosis, especially liver and liver fibrosis. Also, 80091 modulators can be used in the treatment and/or diagnosis of virus-associated carcinoma, especially hepatocellular cancer.

[5396] Additionally, 80091 may play an important role in the regulation of metabolism or pain disorders. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes. Examples of pain disorders include, but are not limited to, pain response elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia, usually referred to as hyperalgesia (described in, for example, Fields, H. L. (1987) Pain, New York: McGraw-Hill); pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches; pain associated with surgery; pain related to irritable bowel syndrome; or chest pain.

[5397] As discussed, successful treatment of 80091 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using an assay described above, that prove to exhibit negative modulatory activity can be used in accordance with the invention to prevent and/or ameliorate symptoms of 80091 disorders. Such molecules can include, but are not limited to, peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)₂ and Fab expression library fragments, scFv molecules, and epitope-binding fragments thereof).

[5398] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[5399] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[5400] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 80091 expression is through the use of aptamer molecules specific for 80091 protein. Aptamers are nucleic acid molecules having a tertiary structure which permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. (1997) Curr. Opin. Chem Biol. 1: 5-9; and Patel, D. J. (1997) Curr Opin Chem Biol 1: 32-46). Since nucleic acid molecules may in many cases be more conveniently introduced into target cells than therapeutic protein molecules may be, aptamers offer a method by which 80091 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[5401] Antibodies can be generated that are both specific for target gene product and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 80091 disorders. For a description of antibodies, see the Antibody section above.

[5402] In circumstances wherein injection of an animal or a human subject with an 80091 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 80091 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. (1999) Ann Med 31:66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. (1998) Cancer Treat Res. 94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 80091 protein. Vaccines directed to a disease characterized by 80091 expression may also be generated in this fashion.

[5403] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see e.g., Marasco et al. (1993) Proc. Natl. Acad. Sci. USA 90:7889-7893).

[5404] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 80091 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures as described above.

[5405] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED₅₀ with little or no toxicity. The dosage can vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose can be formulated in animal models to achieve a circulating plasma concentration range that includes the IC₅₀ (i.e., the concentration of the test compound that achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma can be measured, for example, by high performance liquid chromatography. Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. The compound which is able to modulate 80091 activity is used as a template, or “imprinting molecule”, to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix which contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al (1996) Current Opinion in Biotechnology 7: 89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2: 166-173. 
Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al (1993) Nature 361:645-647. Through the use of isotope-labeling, the “free” concentration of compound which modulates the expression or activity of 80091 can be readily monitored and used in calculations of IC₅₀.
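
As an illustration only, the estimation of an IC₅₀ from cell culture measurements discussed above can be sketched as an interpolation, on a logarithmic concentration scale, between the two tested concentrations that bracket 50% inhibition. The function name, the data values, and the choice of interpolation are assumptions made for this sketch and are not taken from the specification; a fitted dose-response model may be preferable in practice.

```python
import math

def estimate_ic50(concentrations, inhibitions):
    """Estimate IC50 by linear interpolation on log10(concentration).

    concentrations: ascending, positive test concentrations (one consistent unit)
    inhibitions:    fractional inhibition (0.0 to 1.0) at each concentration
    Returns None when the measurements never cross 50% inhibition.
    """
    points = list(zip(concentrations, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 0.5 <= i_hi and i_hi > i_lo:
            fraction = (0.5 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + fraction * (
                math.log10(c_hi) - math.log10(c_lo)
            )
            return 10 ** log_ic50
    return None

# Hypothetical measurements, for illustration only.
doses = [0.1, 1.0, 10.0, 100.0]
inhibition = [0.05, 0.25, 0.75, 0.95]
ic50 = estimate_ic50(doses, inhibition)
print(ic50)
```

The returned value is expressed in the same unit as the input concentrations and could serve as the starting point for the circulating plasma concentration range described above.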

[5406] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC₅₀. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al (1995) Analytical Chemistry 67: 2142-2144.

[5407] Another aspect of the invention pertains to methods of modulating 80091 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with an 80091 molecule or an agent that modulates one or more of the activities of 80091 protein associated with the cell. An agent that modulates 80091 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of an 80091 protein (e.g., an 80091 substrate or receptor), an 80091 antibody, an 80091 agonist or antagonist, a peptidomimetic of an 80091 agonist or antagonist, or other small molecule.

[5408] In one embodiment, the agent stimulates one or more 80091 activities. Examples of such stimulatory agents include active 80091 protein and a nucleic acid molecule encoding 80091. In another embodiment, the agent inhibits one or more 80091 activities. Examples of such inhibitory agents include antisense 80091 nucleic acid molecules, anti-80091 antibodies, and 80091 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of an 80091 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., up regulates or down regulates) 80091 expression or activity. In another embodiment, the method involves administering an 80091 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 80091 expression or activity.

[5409] Stimulation of 80091 activity is desirable in situations in which 80091 is abnormally downregulated and/or in which increased 80091 activity is likely to have a beneficial effect. Likewise, inhibition of 80091 activity is desirable in situations in which 80091 is abnormally upregulated and/or in which decreased 80091 activity is likely to have a beneficial effect.

[5410] 80091 Pharmacogenomics

[5411] The 80091 molecules of the present invention, as well as agents, or modulators which have a stimulatory or inhibitory effect on 80091 activity (e.g., 80091 gene expression) as identified by a screening assay described herein can be administered to individuals to treat (prophylactically or therapeutically) 80091 associated disorders (e.g., an erythroid-associated disorder) associated with aberrant or unwanted 80091 activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer an 80091 molecule or 80091 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with an 80091 molecule or 80091 modulator.

[5412] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23:983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43:254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as single factors altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase deficiency (G6PD) is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[5413] One pharmacogenomics approach to identifying genes that predict drug response, known as a “genome-wide association”, relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high resolution map can be generated from a combination of some ten million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
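
By way of a minimal sketch, the grouping of individuals into genetic categories according to their pattern of SNPs can be expressed as follows. The subject identifiers, marker names, genotype calls, and responder set are hypothetical and serve only to illustrate the grouping step; they do not represent data from any trial.

```python
from collections import defaultdict

def group_by_snp_pattern(genotypes, marker_ids):
    """Group subject identifiers by their genotype calls at the chosen markers."""
    groups = defaultdict(list)
    for subject, calls in genotypes.items():
        pattern = tuple(calls[marker] for marker in marker_ids)
        groups[pattern].append(subject)
    return dict(groups)

def response_rate_by_group(groups, responders):
    """Fraction of subjects in each genetic category who responded to the drug."""
    return {
        pattern: sum(subject in responders for subject in subjects) / len(subjects)
        for pattern, subjects in groups.items()
    }

# Hypothetical genotype calls at two bi-allelic markers.
genotypes = {
    "subj1": {"snpA": "A/A", "snpB": "C/T"},
    "subj2": {"snpA": "A/A", "snpB": "C/T"},
    "subj3": {"snpA": "A/G", "snpB": "C/C"},
    "subj4": {"snpA": "A/G", "snpB": "C/C"},
}
responders = {"subj1", "subj2", "subj3"}

groups = group_by_snp_pattern(genotypes, ["snpA", "snpB"])
rates = response_rate_by_group(groups, responders)
print(rates)
```

A marked difference in response rate between categories would only suggest a marker worth following up; establishing an association would require an appropriate statistical test on an adequately sized cohort.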

[5414] Alternatively, a method termed the “candidate gene approach,” can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., an 80091 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.
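
One way to determine whether carrying a given variant is associated with a drug response is to tabulate carriers and non-carriers against responders and non-responders and to test the resulting 2x2 table. The sketch below computes an odds ratio and a two-sided Fisher's exact p-value using only the standard library. The choice of Fisher's exact test and the counts shown are assumptions made for illustration and are not prescribed by the specification.

```python
from math import comb

def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a: carriers who responded        b: carriers who did not respond
    c: non-carriers who responded    d: non-carriers who did not respond
    """
    if b * c == 0:
        return float("inf")
    return (a * d) / (b * c)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value: sum of the probabilities of all tables
    with the same margins that are no more likely than the observed table."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x responding carriers.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    return sum(
        prob(x) for x in range(low, high + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )

# Hypothetical counts, for illustration only.
print(odds_ratio(3, 1, 1, 3))
print(fisher_exact_two_sided(3, 1, 1, 3))
```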

[5415] Alternatively, a method termed “gene expression profiling” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., an 80091 molecule or 80091 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[5416] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with an 80091 molecule or 80091 modulator, such as a modulator identified by one of the exemplary screening assays described herein.

[5417] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 80091 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 80091 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells, e.g., human cells, will become sensitive to treatment with an agent that the unmodified target cells were resistant to.

[5418] Monitoring the influence of agents (e.g., drugs) on the expression or activity of an 80091 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 80091 gene expression, protein levels, or upregulate 80091 activity, can be monitored in clinical trials of subjects exhibiting decreased 80091 gene expression, protein levels, or downregulated 80091 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 80091 gene expression, protein levels, or downregulate 80091 activity, can be monitored in clinical trials of subjects exhibiting increased 80091 gene expression, protein levels, or upregulated 80091 activity. In such clinical trials, the expression or activity of an 80091 gene, and preferably, other genes that have been implicated in, for example, an 80091-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[5419] 80091 Informatics

[5420] The sequence of an 80091 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains an 80091 sequence. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 80091 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequence, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[5421] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[5422] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[5423] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples for annotation to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites and splice junctions. Non-limiting examples for annotations to amino acid sequence include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
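
The two-table layout described above can be sketched as follows. SQLite, accessed through the Python standard library, is used here merely as a convenient stand-in for a relational database such as Sybase or Oracle. The table names, column names, placeholder sequence, and annotation are hypothetical and do not correspond to any sequence of the invention.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sequence (
    residues TEXT NOT NULL,        -- first column: the sequence itself
    seq_id   TEXT PRIMARY KEY      -- second column: identifier for the sequence
);
CREATE TABLE annotation (
    seq_id     TEXT NOT NULL REFERENCES sequence(seq_id),
    descriptor TEXT NOT NULL,      -- annotation text
    start_pos  INTEGER NOT NULL,   -- initial position (1-based)
    end_pos    INTEGER NOT NULL    -- ultimate position (1-based, inclusive)
);
""")

# Placeholder data, for illustration only.
conn.execute("INSERT INTO sequence VALUES (?, ?)", ("ATGGCCAAGTGA", "demo_seq"))
conn.execute(
    "INSERT INTO annotation VALUES (?, ?, ?, ?)",
    ("demo_seq", "hypothetical SNP", 5, 5),
)

# Retrieve each annotation together with the residues it refers to.
rows = conn.execute("""
    SELECT a.descriptor,
           substr(s.residues, a.start_pos, a.end_pos - a.start_pos + 1)
    FROM annotation AS a
    JOIN sequence AS s ON s.seq_id = a.seq_id
""").fetchall()
print(rows)
```

Keeping annotations in their own table keyed by the sequence identifier allows any number of annotations per sequence without altering the sequence table.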

[5424] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.

[5425] Thus, in one aspect, the invention features a method of analyzing 80091, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing an 80091 nucleic acid or amino acid sequence; and comparing the 80091 sequence with a second sequence, e.g., one or, more preferably, a plurality of sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 80091. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[5426] The method can include evaluating the sequence identity between an 80091 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.
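
A deliberately simplified sketch of such a comparison is given below: an exact search for a target motif within a stored sequence, and a percent identity calculation over two ungapped sequences of equal length. This is not BLAST and does not perform alignment; the function names and the placeholder sequences are assumptions made for illustration.

```python
def find_matches(stored, target):
    """Return the 0-based start position of every exact occurrence of target."""
    positions = []
    start = stored.find(target)
    while start != -1:
        positions.append(start)
        start = stored.find(target, start + 1)  # allow overlapping matches
    return positions

def percent_identity(seq_a, seq_b):
    """Percent of identical positions between two equal-length, ungapped sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)

# Placeholder sequences, for illustration only.
stored = "ATGGCCAAGATGGCC"
hits = find_matches(stored, "ATGGCC")
identity = percent_identity("ATGGCCAAG", "ATGGTCAAG")
print(hits, identity)
```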

[5427] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
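
The relationship between target length and chance occurrence can be made concrete with a rough estimate. Assuming, for this sketch only, that database residues are independent and equally frequent, the expected number of chance occurrences of a target of length L in a database of N residues over an alphabet of k letters is (N − L + 1)·k^(−L). Real sequence databases depart from this assumption, so the figures are indicative only.

```python
def expected_chance_hits(target_length, database_length, alphabet_size):
    """Expected number of exact chance matches under a uniform random model."""
    positions = database_length - target_length + 1
    if positions <= 0:
        return 0.0
    return positions * (1.0 / alphabet_size) ** target_length

# Hypothetical database of one million nucleotides (alphabet size 4).
for length in (6, 12, 30):
    print(length, expected_chance_hits(length, 1_000_000, 4))
```

Under this model a six-nucleotide target is expected to occur by chance a few hundred times in such a database, whereas a thirty-nucleotide target is expected essentially never to do so.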

[5428] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means are available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[5429] Thus, the invention features a method of making a computer readable record of an 80091 sequence which includes recording the sequence on a computer readable matrix. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.
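
As a minimal sketch, a record of this kind that includes identification of an ORF might be produced as shown below. The ORF search is restricted to the forward strand and to ATG start codons with the standard stop codons. The JSON layout, the field names, and the placeholder sequence are assumptions made for illustration and are not part of the specification.

```python
import json

STOP_CODONS = {"TAA", "TAG", "TGA"}

def longest_forward_orf(dna):
    """Return (start, end) of the longest ATG-to-stop ORF on the forward strand.

    Coordinates are 0-based and half-open; returns None if no ORF is found.
    """
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None
    return best

def make_record(identifier, dna):
    """Serialize the sequence and its ORF annotation as a JSON string."""
    orf = longest_forward_orf(dna)
    return json.dumps({
        "id": identifier,
        "sequence": dna,
        "orf": {"start": orf[0], "end": orf[1]} if orf else None,
    })

# Placeholder sequence, for illustration only.
record = make_record("demo_seq", "CCATGGCCAAGTGACC")
print(record)
```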

[5430] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing an 80091 sequence, or record, in machine-readable form; comparing a second sequence to the 80091 sequence; thereby analyzing a sequence. Comparison can include comparing to sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 80091 sequence includes a sequence being compared. In a preferred embodiment the 80091 or second sequence is stored on a first computer, e.g., at a first site and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 80091 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[5431] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has an 80091-associated disease or disorder or a pre-disposition to an 80091-associated disease or disorder, wherein the method comprises the steps of determining 80091 sequence information associated with the subject and based on the 80091 sequence information, determining whether the subject has an 80091-associated disease or disorder or a pre-disposition to an 80091-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[5432] The invention further provides in an electronic system and/or in a network, a method for determining whether a subject has an 80091-associated disease or disorder or a pre-disposition to a disease associated with an 80091, wherein the method comprises the steps of determining 80091 sequence information associated with the subject, and based on the 80091 sequence information, determining whether the subject has an 80091-associated disease or disorder or a pre-disposition to an 80091-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, comparing the 80091 sequence of the subject to the 80091 sequences in the database to thereby determine whether the subject has an 80091-associated disease or disorder, or a pre-disposition for such.

[5433] The present invention also provides in a network, a method for determining whether a subject has an 80091-associated disease or disorder or a pre-disposition to an 80091-associated disease or disorder, said method comprising the steps of receiving 80091 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 80091 and/or corresponding to an 80091-associated disease or disorder (e.g., an erythroid-associated disorder), and based on one or more of the phenotypic information, the 80091 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has an 80091-associated disease or disorder or a pre-disposition to an 80091-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[5434] The present invention also provides a method for determining whether a subject has an 80091-associated disease or disorder or a pre-disposition to an 80091-associated disease or disorder, said method comprising the steps of receiving information related to 80091 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 80091 and/or related to an 80091-associated disease or disorder, and based on one or more of the phenotypic information, the 80091 information, and the acquired information, determining whether the subject has an 80091-associated disease or disorder or a pre-disposition to an 80091-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[5435] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

Background of the 46508 Invention

[5436] Peptidyl-tRNA hydrolases are important cellular enzymes that cleave peptidyl-tRNA molecules that are prematurely released from ribosomes as a result of abortive translation events. The esterase activity of peptidyl-tRNA hydrolases cleaves the covalent bond between the nascent peptide and the tRNA. Such cleavage results in the recycling of tRNA. The peptidyl-tRNA hydrolase from E. coli is well characterized, and homologous proteins are found in many eubacterial species. In E. coli, the gene encoding peptidyl-tRNA hydrolase is essential. Further, the level of peptidyl-tRNA hydrolase activity required for viability is increased under conditions that increase premature translational termination, such as exposure to antibiotics (Menninger and Coleman (1993) Antimicrob. Agents Chemother. 37:2027-2029), and reduced when tRNAs particularly prone to dissociate from the ribosome are supplied in excess (Heurgue-Hamard et al. (1996) EMBO J. 15:2826-2833).

[5437] The x-ray crystal structure of E. coli peptidyl-tRNA hydrolase was determined at high resolution (Schmitt et al. (1997) EMBO J. 16:4760-4769). The monomeric protein contains a single α/β globular domain of seven β-strands and six α-helices. The peptidyl-tRNA hydrolase enzyme structure has structural similarity to an aminopeptidase from Aeromonas proteolytica (GenPept:640150) (Chevrier, B. et al. (1994) Structure 2:283-291) and to a lesser extent to bovine purine nucleoside phosphorylase (GenPept:2624420) (Koellner, G. et al. (1997) J. Mol. Biol. 265:202-216). Genetic data and structural analysis indicate that three residues, asparagine 10, histidine 20, and aspartic acid 93 in the E. coli enzyme, are critical residues for catalysis. In addition, asparagine 68 and asparagine 114 of the E. coli enzyme are poised to make favorable electrostatic contacts with the peptide region of the peptidyl-tRNA substrate, whereas arginine 133 of the E. coli sequence may contact the tRNA portion of the substrate. The amino acid identities of these positions are conserved in alignments of eubacterial peptidyl-tRNA hydrolases.

Summary of the 46508 Invention

[5438] The present invention is based, in part, on the discovery of a novel peptidyl-tRNA hydrolase, referred to herein as “46508”. The nucleotide sequence of a cDNA encoding 46508 is shown in SEQ ID NO:101, and the amino acid sequence of a 46508 polypeptide is shown in SEQ ID NO:102. In addition, the nucleotide sequence of the coding region is depicted in SEQ ID NO:103.

[5439] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes a 46508 protein or polypeptide, e.g., a biologically active portion of the 46508 protein. In a preferred embodiment, the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO:102. In other embodiments, the invention provides isolated 46508 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO:101 or SEQ ID NO:103. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO:101 or SEQ ID NO:103. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:101 or SEQ ID NO:103, wherein the nucleic acid encodes a full length 46508 protein or an active fragment thereof.

[5440] In a related aspect, the invention further provides nucleic acid constructs that include a 46508 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included are vectors and host cells containing the 46508 nucleic acid molecules of the invention, e.g., vectors and host cells suitable for producing 46508 nucleic acid molecules and polypeptides.

[5441] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 46508-encoding nucleic acids.

[5442] In still another related aspect, isolated nucleic acid molecules that are antisense to a 46508 encoding nucleic acid molecule are provided.

[5443] In another aspect, the invention features 46508 polypeptides, and biologically active or antigenic fragments thereof, that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 46508-mediated or -related disorders. In another embodiment, the invention provides 46508 polypeptides having a 46508 activity. Preferred polypeptides are 46508 proteins including at least one peptidyl-tRNA hydrolase domain, and preferably, having a 46508 activity, e.g., a 46508 activity as described herein.

[5444] In other embodiments, the invention provides 46508 polypeptides, e.g., a 46508 polypeptide having the amino acid sequence shown in SEQ ID NO:102; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO:102; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:101 or SEQ ID NO:103, wherein the nucleic acid encodes a full length 46508 protein or an active fragment thereof.

[5445] In a related aspect, the invention further provides nucleic acid constructs which include a 46508 nucleic acid molecule described herein.

[5446] In a related aspect, the invention provides 46508 polypeptides or fragments operatively linked to non-46508 polypeptides to form fusion proteins.

[5447] In another aspect, the invention features antibodies and antigen-binding fragments thereof, that react with, or more preferably specifically bind 46508 polypeptides or fragments thereof (e.g., a peptidyl-tRNA hydrolase domain). In one embodiment, the antibodies or antigen-binding fragment thereof competitively inhibit the binding of a second antibody to a 46508 polypeptide or a fragment thereof.

[5448] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 46508 polypeptides or nucleic acids.

[5449] In still another aspect, the invention provides a process for modulating 46508 polypeptide or nucleic acid expression or activity, e.g. using the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 46508 polypeptides or nucleic acids, such as conditions, e.g., disorders or diseases, involving aberrant or deficient cellular proliferation or differentiation, e.g., ovarian, colon, lung primary or metastatic tumors; cardiovascular disorders, e.g., endothelial cell, or smooth muscle cell, disorders; and neural, e.g., brain, disorders.

[5450] In yet another aspect, the invention provides methods for inhibiting the aberrant activity of a 46508-expressing cell, e.g., a hyper-proliferative 46508-expressing cell. The method includes contacting the cell with an agent, e.g., a compound (e.g., a compound identified using the methods described herein), that modulates the activity, or expression, of the 46508 polypeptide or nucleic acid.

[5451] In a preferred embodiment, the contacting step is effected in vitro or ex vivo. In other embodiments, the contacting step is effected in vivo, e.g., in a subject (e.g., a mammal, e.g., a human), as part of a therapeutic or prophylactic protocol.

[5452] In a preferred embodiment, the cell is a hyperproliferative cell, e.g., a cell found in a solid tumor, a soft tissue tumor, or a metastatic lesion, e.g., an ovarian, colon, lung primary or metastatic tumor. In other embodiments, the cell is a cardiovascular cell, e.g., an endothelial cell, or smooth muscle cell; or a neural, e.g., brain, cell.

[5453] In a preferred embodiment, the agent, e.g., the compound, is an inhibitor of a 46508 polypeptide. For example, the agent inhibits the proliferation, or induces the killing of a 46508-expressing cell. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion). In another preferred embodiment, the agent, e.g., the compound, is an inhibitor of a 46508 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[5454] In a preferred embodiment, the agent, e.g., the compound, is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[5455] In another aspect, the invention features methods for treating or preventing a disorder characterized by aberrant activity, e.g., cellular proliferation or differentiation, of a 46508-expressing cell, in a subject. Preferably, the method includes administering to the subject (e.g., a mammal, e.g., a human) an effective amount of a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 46508 polypeptide or nucleic acid. In a preferred embodiment, the disorder is a cancerous or pre-cancerous condition.

[5456] In a further aspect, the invention provides methods for evaluating the efficacy of a treatment of a disorder, e.g., a proliferative, cardiovascular, or a neural disorder. The method includes: treating a subject, e.g., a patient or an animal, with a protocol under evaluation (e.g., treating a subject with one or more of: chemotherapy, radiation, and/or a compound identified using the methods described herein); and evaluating the expression of a 46508 nucleic acid or polypeptide before and after treatment. A change, e.g., a decrease or increase, in the level of a 46508 nucleic acid (e.g., mRNA) or polypeptide after treatment, relative to the level of expression before treatment, is indicative of the efficacy of the treatment of the disorder. The level of 46508 nucleic acid or polypeptide expression can be detected by any method described herein.

[5457] In a preferred embodiment, the evaluating step includes obtaining a sample (e.g., a tissue sample, e.g., a biopsy, or a fluid sample) from the subject, before and after treatment, and comparing the level of expression of a 46508 nucleic acid (e.g., mRNA) or polypeptide before and after treatment.
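
By way of non-limiting illustration only, the before-and-after comparison described above can be sketched as a simple fold-change calculation. The function name, the two-fold cutoff, and the use of arbitrary expression units are assumptions made for the example; the application does not specify a threshold.

```python
def expression_change(before, after, min_fold=2.0):
    """Classify the change in expression level after treatment.

    `before` and `after` are expression levels in the same arbitrary units.
    `min_fold` is a hypothetical cutoff, not a value from this application.
    """
    if before <= 0 or after <= 0:
        raise ValueError("expression levels must be positive")
    fold = after / before
    if fold >= min_fold:
        return "increase"
    if fold <= 1.0 / min_fold:
        return "decrease"
    return "no change"

# Example with made-up measurements: a five-fold drop after treatment.
result = expression_change(before=10.0, after=2.0)
```

A change in either direction is reported, consistent with the statement that a decrease or an increase may be indicative of efficacy.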

[5458] In another aspect, the invention provides methods for evaluating the efficacy of a therapeutic or prophylactic agent (e.g., an anti-neoplastic agent). The method includes: contacting a sample with an agent (e.g., a compound identified using the methods described herein, or a cytotoxic agent) and evaluating the expression of 46508 nucleic acid or polypeptide in the sample before and after the contacting step. A change, e.g., a decrease or increase, in the level of 46508 nucleic acid (e.g., mRNA) or polypeptide in the sample obtained after the contacting step, relative to the level of expression in the sample before the contacting step, is indicative of the efficacy of the agent. The level of 46508 nucleic acid or polypeptide expression can be detected by any method described herein. In a preferred embodiment, the sample includes cells obtained from a cancerous tissue.

[5459] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 46508 polypeptide or nucleic acid molecule, including for disease diagnosis.

[5460] In another aspect, the invention features a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes a 46508 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to a 46508 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 46508 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.
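
By way of non-limiting illustration only, the array described above can be modeled as a mapping from positionally distinguishable addresses to unique capture probes. The class name, function names, grid coordinates, and probe labels below are hypothetical and are chosen only for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CaptureProbe:
    kind: str    # e.g. "nucleic acid" or "polypeptide"
    target: str  # label of the molecule the probe recognizes

def build_array(probes_by_address):
    """Return a copy of the address -> probe mapping after checking that
    every address carries a unique capture probe."""
    probes = list(probes_by_address.values())
    if len(set(probes)) != len(probes):
        raise ValueError("each address must have a unique capture probe")
    return dict(probes_by_address)

def addresses_for(array, target):
    """List, in sorted order, the addresses whose probe recognizes `target`."""
    return sorted(addr for addr, probe in array.items() if probe.target == target)

# Hypothetical 2 x 2 layout; (row, column) tuples serve as addresses.
array = build_array({
    (0, 0): CaptureProbe("nucleic acid", "46508"),
    (0, 1): CaptureProbe("polypeptide", "46508"),
    (1, 0): CaptureProbe("nucleic acid", "control"),
})
hits = addresses_for(array, "46508")
```

Using (row, column) tuples as dictionary keys makes each address distinguishable by position, and the uniqueness check mirrors the requirement that each address have its own capture probe.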

[5461] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

Detailed Description of 46508

[5462] The human 46508 sequence (see SEQ ID NO:101, as recited in Example 58), which is approximately 1182 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 684 nucleotides, including the termination codon. The coding sequence encodes a 227 amino acid protein (see SEQ ID NO:102, as recited in Example 58). Human 46508 contains the following regions or other structural features:
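
By way of illustration only, the lengths recited above are mutually consistent: a 227 amino acid protein requires 681 coding nucleotides, plus three for the termination codon. The constant and function names in this sketch are illustrative and do not appear in the application.

```python
# Figures recited in the paragraph above.
TRANSCRIPT_LENGTH_NT = 1182  # approximate cDNA length, including untranslated regions
CODING_LENGTH_NT = 684       # coding sequence, including the termination codon
PROTEIN_LENGTH_AA = 227      # encoded protein

def expected_coding_length(protein_length_aa):
    """Nucleotides needed to encode the protein plus one termination codon."""
    return 3 * protein_length_aa + 3

# Approximate combined length of the 5' and 3' untranslated regions.
untranslated_nt = TRANSCRIPT_LENGTH_NT - CODING_LENGTH_NT
```

On these figures, roughly 498 nucleotides of the cDNA are untranslated.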

[5463] a peptidyl-tRNA hydrolase domain (PFAM Accession PF01195) located at about amino acid residues 44 to 221 of SEQ ID NO:102;

[5464] two Protein Kinase C sites (PS00005) at about amino acids 13 to 15, and 150 to 152 of SEQ ID NO: 102;

[5465] two Casein Kinase II sites (PS00006) located at about amino acids 125 to 128, and 194 to 197 of SEQ ID NO:102;

[5466] seven N-myristoylation sites (PS00008) located at about amino acids 4 to 9, 17 to 22, 23 to 28, 53 to 58, 74 to 79, 149 to 154, and 156 to 161 of SEQ ID NO:102;

[5467] one amidation site (PS00009) located at about amino acids 40 to 43 of SEQ ID NO:102; and

[5468] one glycosaminoglycan attachment site (PS00002) located at about amino acids 3 to 6 of SEQ ID NO: 102.
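
By way of non-limiting illustration only, the features listed above can be collected into a simple lookup table, which makes it straightforward to ask, for example, which predicted sites fall inside the peptidyl-tRNA hydrolase domain. The variable and function names are illustrative; the coordinates are the approximate positions recited above and are treated here as exact, inclusive, 1-based spans.

```python
# Peptidyl-tRNA hydrolase domain (PF01195), about residues 44 to 221.
DOMAIN = (44, 221)

# Predicted sites, keyed by PROSITE accession and description.
SITES = {
    "PS00005 protein kinase C": [(13, 15), (150, 152)],
    "PS00006 casein kinase II": [(125, 128), (194, 197)],
    "PS00008 N-myristoylation": [(4, 9), (17, 22), (23, 28), (53, 58),
                                 (74, 79), (149, 154), (156, 161)],
    "PS00009 amidation": [(40, 43)],
    "PS00002 glycosaminoglycan attachment": [(3, 6)],
}

def span_length(span):
    """Number of residues in an inclusive (start, end) span."""
    start, end = span
    return end - start + 1

def sites_within(domain, sites):
    """Return (name, span) pairs for sites lying entirely inside `domain`."""
    d_start, d_end = domain
    return [(name, span)
            for name, spans in sites.items()
            for span in spans
            if d_start <= span[0] and span[1] <= d_end]

inside = sites_within(DOMAIN, SITES)
```

On these coordinates, seven of the thirteen predicted sites lie within the domain, and the amidation and glycosaminoglycan attachment sites lie N-terminal to it.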

[5469] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[5470] A plasmid containing the nucleotide sequence encoding human 46508 (clone “Fbh46508FL”) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

[5471] The 46508 protein contains a significant number of structural characteristics in common with members of the peptidyl-tRNA hydrolase family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics.

[5472] Peptidyl-tRNA hydrolases are a family of enzymes which hydrolyze ester linkages between the peptide moiety and the tRNA of peptidyl-tRNAs (Kössel, H (1969) Biochim. Biophys. Acta 204:191-202; Garcia-Villegas, M. R. (1991) EMBO J. 10:3549-3555). Such substrates are the products of premature release of nascent polypeptides from the ribosome. Cleavage of the bond between the peptide and tRNA allows the tRNA to return into the tRNA pool for subsequent charging and reuse. A conserved asparagine, a conserved histidine, and a conserved aspartic acid, residues 10, 20, and 93 of the E. coli peptidyl-tRNA hydrolase, respectively, are required to catalyze this cleavage reaction. The 46508 polypeptide (SEQ ID NO:102) has these conserved residues, respectively: an asparagine at position 51, a histidine at position 59, and an aspartic acid at position 134 of SEQ ID NO:102. In addition, two conserved asparagines and a conserved arginine, residues 68, 114, and 133 of the E. coli peptidyl-tRNA hydrolase, respectively, contribute to the specificity of substrate recognition. The 46508 polypeptide (SEQ ID NO:102) also has these conserved residues, as shown in the alignment in FIG. 43, namely, an asparagine at position 109, an asparagine at position 155, and an arginine at position 173 of SEQ ID NO:102.
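
By way of non-limiting illustration only, the residue correspondence described above can be tabulated and used to check a candidate sequence for the conserved residues. The function and variable names are illustrative. Because SEQ ID NO:102 is not reproduced here, the example uses a hypothetical poly-alanine placeholder into which the six residues have been written; it is not the 46508 sequence.

```python
# E. coli position -> (46508 position in SEQ ID NO:102, one-letter residue),
# as recited in the paragraph above.
E_COLI_TO_46508 = {
    10: (51, "N"),    # catalytic asparagine
    20: (59, "H"),    # catalytic histidine
    93: (134, "D"),   # catalytic aspartic acid
    68: (109, "N"),   # substrate recognition
    114: (155, "N"),  # substrate recognition
    133: (173, "R"),  # substrate recognition
}

CONSERVED = {pos: aa for pos, aa in E_COLI_TO_46508.values()}

def missing_conserved_residues(sequence, expected=CONSERVED):
    """Return {position: (expected, found)} for each 1-based position whose
    residue differs from the expected one; empty if all are present."""
    mismatches = {}
    for pos, aa in expected.items():
        found = sequence[pos - 1] if pos <= len(sequence) else None
        if found != aa:
            mismatches[pos] = (aa, found)
    return mismatches

# Hypothetical placeholder of 227 residues, NOT SEQ ID NO:102.
toy = ["A"] * 227
for pos, aa in CONSERVED.items():
    toy[pos - 1] = aa
toy_sequence = "".join(toy)
```

An empty result indicates that all six positions carry the expected residues; any mismatch is reported together with the residue actually found.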

[5473] Cells which utilize the translation machinery more intensely than quiescent cells, e.g., rapidly growing cells, environmentally stressed cells, and virally infected cells, are likely to produce more peptidyl-tRNA substrates. Further, because of the increased translation activity, such cells also require larger tRNA pools than quiescent cells, either in their cytoplasm or mitochondria, or both. Accordingly, the activity of peptidyl-tRNA hydrolase enzymes may be required by such cells. Thus, inhibition of 46508 activity might be a successful route to treatment of a variety of disorders, including but not limited to, disorders of cell proliferation, cell differentiation, viral infection, and metabolism.

[5474] A 46508 polypeptide can include a “peptidyl-tRNA hydrolase domain” or regions homologous with a “peptidyl-tRNA hydrolase domain”. A 46508 polypeptide can optionally further include at least one glycosaminoglycan attachment site; at least one, preferably two, protein kinase C phosphorylation sites; at least one, preferably two, casein kinase II phosphorylation sites; at least one, two, three, four, five, six, preferably seven, N-myristoylation sites; and at least one amidation site.

[5475] As used herein, the term “peptidyl-tRNA hydrolase domain” includes an amino acid sequence of about 160 to 240 amino acid residues in length and having a bit score for the alignment of the sequence to the peptidyl-tRNA hydrolase domain profile (Pfam HMM) of at least 80. Preferably, the peptidyl-tRNA hydrolase domain has an amino acid sequence of about 170 to about 200 amino acids, more preferably about 170 to 190 amino acids, or about 177 amino acids, and has a bit score for the alignment of the sequence to the peptidyl-tRNA hydrolase domain (HMM) of at least 100, preferably of at least 120, more preferably of at least 130 or greater. Preferably, the peptidyl-tRNA hydrolase domain further includes the following highly conserved residues: one, preferably two, more preferably three asparagine residues, a histidine residue, an aspartic acid, and an arginine corresponding respectively to asparagine 51, asparagine 109, asparagine 155, histidine 59, aspartic acid 134, and arginine 173 of SEQ ID NO:102. The peptidyl-tRNA hydrolase domain (HMM) has been assigned the PFAM Accession (PF01195) (http://genome.wustl.edu/Pfam/html). An alignment of the peptidyl-tRNA hydrolase domain (from about amino acids 44 to about 221 of SEQ ID NO:102) of human 46508 with a consensus amino acid sequence derived from a hidden Markov model (PFAM) is depicted in FIG. 43.
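
By way of non-limiting illustration only, the length and bit score criteria in the definition above can be expressed as a pair of simple checks. The function names are illustrative, and the "about" ranges are treated here as hard cutoffs purely for the purposes of the example. The bit score is assumed to have been obtained separately from a profile HMM search; no search is performed by this sketch.

```python
# Bit score thresholds recited above, from most to least stringent.
SCORE_TIERS = (130, 120, 100, 80)

def meets_base_definition(length_aa, bit_score):
    """True if a region satisfies the broadest recited criteria:
    about 160 to 240 residues and a bit score of at least 80."""
    return 160 <= length_aa <= 240 and bit_score >= 80

def highest_tier(bit_score):
    """Return the most stringent recited threshold met by `bit_score`,
    or None if the score is below all of them."""
    for threshold in SCORE_TIERS:
        if bit_score >= threshold:
            return threshold
    return None
```

For example, a 178 residue region with a hypothetical bit score of 125 would satisfy the base definition and meet the "at least 120" tier but not the "at least 130" tier.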

[5476] In a preferred embodiment, a 46508 polypeptide or protein has a “peptidyl-tRNA hydrolase domain” or a region which includes at least about 120 to about 200 amino acids, more preferably about 160 to 190, 170 to 180, or about 177 amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “peptidyl-tRNA hydrolase domain,” e.g., the peptidyl-tRNA hydrolase domain of human 46508 (e.g., residues 44 to 221 of SEQ ID NO:102).

[5477] To identify the presence of a “peptidyl-tRNA hydrolase” domain in a 46508 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against the Pfam database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference. A search was performed against the HMM database resulting in the identification of a “peptidyl-tRNA hydrolase” domain in the amino acid sequence of human 46508 at about residues 44 to about 221 of SEQ ID NO:102 (see FIG. 43).

[5478] A 46508 family member can include at least one peptidyl-tRNA hydrolase domain or regions homologous with a peptidyl-tRNA hydrolase domain. Furthermore, a 46508 family member can include at least one, preferably two, protein kinase C phosphorylation sites (PS00005); at least one, preferably two, casein kinase II phosphorylation sites (PS00006); at least one, two, three, four, five, six, preferably seven, N-myristoylation sites; and at least one amidation site.

[5479] As the 46508 polypeptides of the invention may modulate 46508-mediated activities, they may be useful for developing novel diagnostic and therapeutic agents for 46508-mediated or related disorders, as described below.

[5480] As used herein, a “46508 activity”, “biological activity of 46508” or “functional activity of 46508”, refers to an activity exerted by a 46508 protein, polypeptide or nucleic acid molecule. For example, a 46508 activity can be an activity exerted by 46508 in a physiological milieu on, e.g., a 46508-responsive cell or on a 46508 substrate, e.g., a protein substrate. A 46508 activity can be determined in vivo or in vitro. In one embodiment, a 46508 activity is a direct activity, such as an association with a 46508 target molecule. A “target molecule” or “binding partner” is a molecule with which a 46508 protein binds or interacts in nature. In another embodiment, a 46508 activity can also be an indirect activity, e.g., a cellular signaling activity mediated by interaction of the 46508 protein with a second protein or with a nucleic acid.

[5481] The features of the 46508 molecules of the present invention can provide biological activities similar to those of peptidyl-tRNA hydrolase family members. For example, the 46508 proteins of the present invention can have one or more of the following activities: (1) ability to bind tRNA; (2) ability to bind peptide fragments; (3) ability to bind peptidyl-tRNAs; (4) ability to hydrolyze the covalent bond between peptide and tRNA within peptidyl-tRNAs; or (5) ability to modulate translational efficiency. The 46508 polypeptide may exhibit one or more of these activities in the milieu of the cell cytoplasm and/or of the cell mitochondria.

[5482] As shown in the Examples below, increased 46508 mRNA expression is detected in a variety of malignant and non-malignant tissues, including cardiovascular tissues (e.g., endothelial cells, coronary smooth muscle cells), pancreas, neural tissues (e.g., brain, hypothalamus, DRG), skin, immune, e.g., erythroid cells, as well as a number of primary and metastatic tumors, e.g., ovarian, breast, and lung tumors. Thus, the 46508 molecules can act as novel diagnostic targets and therapeutic agents for controlling disorders involving aberrant activity of those cells, e.g., cell proliferative disorders (e.g., cancer), cardiovascular disorders, neurological disorders, pancreatic disorders, skin and immune, e.g., erythroid, disorders.

[5483] High transcriptional expression of 46508 was observed in tumor samples compared to normal organ control samples. For example, high expression was observed in 5/5 primary ovarian tumor samples, 4/4 primary colon tumor samples, 2/2 colon to liver metastases, and 3/6 primary lung tumor samples. Additionally, high expression was observed in proliferating HMVEC cells when compared to arrested HMVEC cells. Therefore, 46508 may mediate or be involved in cellular proliferative and/or differentiative disorders.

[5484] Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast and liver origin.

[5485] As used herein, the terms “cancer”, “hyperproliferative” and “neoplastic” refer to cells having the capacity for autonomous growth. Examples of such cells include cells having an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative and neoplastic disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. “Pathologic hyperproliferative” cells occur in disease states characterized by malignant tumor growth. Examples of non-pathologic hyperproliferative cells include proliferation of cells associated with wound repair.

[5486] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as those affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[5487] The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

[5488] The term “sarcoma” is art recognized and refers to malignant tumors of mesenchymal derivation.

[5489] Examples of cellular proliferative and/or differentiative disorders of the breast include, but are not limited to, proliferative breast disease including, e.g., epithelial hyperplasia, sclerosing adenosis, and small duct papillomas; tumors, e.g., stromal tumors such as fibroadenoma, phyllodes tumor, and sarcomas, and epithelial tumors such as large duct papilloma; carcinoma of the breast including in situ (noninvasive) carcinoma that includes ductal carcinoma in situ (including Paget's disease) and lobular carcinoma in situ, and invasive (infiltrating) carcinoma including, but not limited to, invasive ductal carcinoma, invasive lobular carcinoma, medullary carcinoma, colloid (mucinous) carcinoma, tubular carcinoma, and invasive papillary carcinoma, and miscellaneous malignant neoplasms. Disorders in the male breast include, but are not limited to, gynecomastia and carcinoma.

[5490] Examples of cellular proliferative and/or differentiative disorders of the lung include, but are not limited to, bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

[5491] Examples of cellular proliferative and/or differentiative disorders of the colon include, but are not limited to, non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors.

[5492] Examples of cellular proliferative and/or differentiative disorders of the liver include, but are not limited to, nodular hyperplasias, adenomas, and malignant tumors, including primary carcinoma of the liver and metastatic tumors.

[5493] Examples of cellular proliferative and/or differentiative disorders of the ovary include, but are not limited to, ovarian tumors such as tumors of coelomic epithelium, serous tumors, mucinous tumors, endometrioid tumors, clear cell adenocarcinoma, adenofibroma, Brenner tumor, surface epithelial tumors; germ cell tumors such as mature (benign) teratomas, monodermal teratomas, immature malignant teratomas, dysgerminoma, endodermal sinus tumor, choriocarcinoma; sex cord-stromal tumors such as granulosa-theca cell tumors, thecoma-fibromas, androblastomas, hilus cell tumors, and gonadoblastoma; and metastatic tumors such as Krukenberg tumors.

[5494] Examples of cancers or neoplastic conditions, in addition to the ones described above, include, but are not limited to, a fibrosarcoma, myosarcoma, liposarcoma, chondrosarcoma, osteogenic sarcoma, chordoma, angiosarcoma, endotheliosarcoma, lymphangiosarcoma, lymphangioendotheliosarcoma, synovioma, mesothelioma, Ewing's tumor, leiomyosarcoma, rhabdomyosarcoma, gastric cancer, esophageal cancer, rectal cancer, pancreatic cancer, ovarian cancer, prostate cancer, uterine cancer, cancer of the head and neck, skin cancer, brain cancer, squamous cell carcinoma, sebaceous gland carcinoma, papillary carcinoma, papillary adenocarcinoma, cystadenocarcinoma, medullary carcinoma, bronchogenic carcinoma, renal cell carcinoma, hepatoma, bile duct carcinoma, choriocarcinoma, seminoma, embryonal carcinoma, Wilms' tumor, cervical cancer, testicular cancer, small cell lung carcinoma, non-small cell lung carcinoma, bladder carcinoma, epithelial carcinoma, glioma, astrocytoma, medulloblastoma, craniopharyngioma, ependymoma, pinealoma, hemangioblastoma, acoustic neuroma, oligodendroglioma, meningioma, melanoma, neuroblastoma, retinoblastoma, leukemia, lymphoma, or Kaposi sarcoma.

[5495] Additional examples of proliferative disorders include hematopoietic neoplastic disorders. As used herein, the term “hematopoietic neoplastic disorders” includes diseases involving hyperplastic/neoplastic cells of hematopoietic origin. A hematopoietic neoplastic disorder can arise from myeloid, lymphoid or erythroid lineages, or precursor cells thereof. Preferably, the diseases arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyelocytic leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit. Rev. in Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to, acute lymphoblastic leukemia (ALL), which includes B-lineage ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell leukemia (HCL) and Waldenstrom's macroglobulinemia (WM). Additional forms of malignant lymphomas include, but are not limited to, non-Hodgkin lymphoma and variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGL), Hodgkin's disease and Reed-Sternberg disease.

[5496] Moderate expression of 46508 was observed in normal heart tissue samples, and thus 46508 may mediate disorders involving the heart, e.g., cardiovascular disorders. As used herein, a disorder involving the heart, a “cardiovascular disease” or a “cardiovascular disorder” includes a disease or disorder which affects the cardiovascular system, e.g., the heart, the blood vessels, and/or the blood. A cardiovascular disorder can be caused by an imbalance in arterial pressure, a malfunction of the heart, or an occlusion of a blood vessel, e.g., by a thrombus. A cardiovascular disorder includes, but is not limited to, disorders such as arteriosclerosis, atherosclerosis, cardiac hypertrophy, ischemia reperfusion injury, restenosis, arterial inflammation, vascular wall remodeling, ventricular remodeling, rapid ventricular pacing, coronary microembolism, tachycardia, bradycardia, pressure overload, aortic bending, coronary artery ligation, vascular heart disease, valvular disease, including but not limited to, valvular degeneration caused by calcification, rheumatic heart disease, endocarditis, or complications of artificial valves; atrial fibrillation, long-QT syndrome, congestive heart failure, sinus node dysfunction, angina, heart failure, hypertension, atrial flutter, pericardial disease, including but not limited to, pericardial effusion and pericarditis; cardiomyopathies, e.g., dilated cardiomyopathy or idiopathic cardiomyopathy, myocardial infarction, coronary artery disease, coronary artery spasm, ischemic disease, arrhythmia, sudden cardiac death, and cardiovascular developmental disorders (e.g., arteriovenous malformations, arteriovenous fistulae, Raynaud's syndrome, neurogenic thoracic outlet syndrome, causalgia/reflex sympathetic dystrophy, hemangioma, aneurysm, cavernous angioma, aortic valve stenosis, atrial septal defects, atrioventricular canal, coarctation of the aorta, Ebstein's anomaly, hypoplastic left heart syndrome, interruption of the aortic arch, mitral valve prolapse, ductus arteriosus, patent foramen ovale, partial anomalous pulmonary venous return, pulmonary atresia with ventricular septal defect, pulmonary atresia without ventricular septal defect, persistence of the fetal circulation, pulmonary valve stenosis, single ventricle, total anomalous pulmonary venous return, transposition of the great vessels, tricuspid atresia, truncus arteriosus, ventricular septal defects). A cardiovascular disease or disorder also can include an endothelial cell disorder.

[5497] As used herein, an “endothelial cell disorder” includes a disorder characterized by aberrant, unregulated, or unwanted endothelial cell activity, e.g., proliferation, migration, angiogenesis, or vascularization; or aberrant expression of cell surface adhesion molecules or genes associated with angiogenesis, e.g., TIE-2, FLT and FLK. Endothelial cell disorders include tumorigenesis, tumor metastasis, psoriasis, diabetic retinopathy, endometriosis, Graves' disease, ischemic disease (e.g., atherosclerosis), and chronic inflammatory diseases (e.g., rheumatoid arthritis).

[5498] Additionally, 46508 molecules may play an important role in the etiology of certain viral diseases, including but not limited to Hepatitis B, Hepatitis C and Herpes Simplex Virus (HSV). Modulators of 46508 activity could be used to control viral diseases, particularly because viral pathogens subvert the host cell translation machinery in order to translate viral genes. The modulators can be used in the treatment and/or diagnosis of virus-infected tissue or virus-associated tissue fibrosis, especially liver tissue and liver fibrosis. Also, 46508 modulators can be used in the treatment and/or diagnosis of virus-associated carcinoma, especially hepatocellular cancer.

[5499] High to moderate expression of 46508 was observed in normal pancreas tissue, and thus, 46508 may mediate disorders involving the pancreas, e.g. pancreatic disorders. Disorders involving the pancreas include those of the exocrine pancreas such as congenital anomalies, including but not limited to, ectopic pancreas; pancreatitis, including but not limited to, acute pancreatitis; cysts, including but not limited to, pseudocysts; tumors, including but not limited to, cystic tumors and carcinoma of the pancreas; and disorders of the endocrine pancreas such as, diabetes mellitus; islet cell tumors, including but not limited to, insulinomas, gastrinomas, and other rare islet cell tumors.

[5500] Moderate to high expression of 46508 mRNA was observed in the normal brain cortex and hypothalamus. Therefore, 46508 may mediate disorders involving these tissues, e.g., disorders of the brain. Disorders involving the brain include, but are not limited to, disorders involving neurons, and disorders involving glia, such as astrocytes, oligodendrocytes, ependymal cells, and microglia; cerebral edema, raised intracranial pressure and herniation, and hydrocephalus; malformations and developmental diseases, such as neural tube defects, forebrain anomalies, posterior fossa anomalies, and syringomyelia and hydromyelia; perinatal brain injury; cerebrovascular diseases, such as those related to hypoxia, ischemia, and infarction, including hypotension, hypoperfusion, and low-flow states (global cerebral ischemia and focal cerebral ischemia), infarction from obstruction of local blood supply, intracranial hemorrhage, including intracerebral (intraparenchymal) hemorrhage, subarachnoid hemorrhage and ruptured berry aneurysms, and vascular malformations, hypertensive cerebrovascular disease, including lacunar infarcts, slit hemorrhages, and hypertensive encephalopathy; infections, such as acute meningitis, including acute pyogenic (bacterial) meningitis and acute aseptic (viral) meningitis, acute focal suppurative infections, including brain abscess, subdural empyema, and extradural abscess, chronic bacterial meningoencephalitis, including tuberculosis and mycobacterioses, neurosyphilis, and neuroborreliosis (Lyme disease), viral meningoencephalitis, including arthropod-borne (Arbo) viral encephalitis, Herpes simplex virus Type 1, Herpes simplex virus Type 2, Varicella-zoster virus (Herpes zoster), cytomegalovirus, poliomyelitis, rabies, and human immunodeficiency virus 1, including HIV-1 meningoencephalitis (subacute encephalitis), vacuolar myelopathy, AIDS-associated myopathy, peripheral neuropathy, and AIDS in children, progressive multifocal leukoencephalopathy, subacute sclerosing panencephalitis, fungal meningoencephalitis, other infectious diseases of the nervous system; transmissible spongiform encephalopathies (prion diseases); demyelinating diseases, including multiple sclerosis, multiple sclerosis variants, acute disseminated encephalomyelitis and acute necrotizing hemorrhagic encephalomyelitis, and other diseases with demyelination; degenerative diseases, such as degenerative diseases affecting the cerebral cortex, including Alzheimer disease and Pick disease, degenerative diseases of basal ganglia and brain stem, including Parkinsonism, idiopathic Parkinson disease (paralysis agitans), progressive supranuclear palsy, corticobasal degeneration, multiple system atrophy, including striatonigral degeneration, Shy-Drager syndrome, and olivopontocerebellar atrophy, and Huntington disease; spinocerebellar degenerations, including spinocerebellar ataxias, including Friedreich ataxia, and ataxia-telangiectasia, degenerative diseases affecting motor neurons, including amyotrophic lateral sclerosis (motor neuron disease), bulbospinal atrophy (Kennedy syndrome), and spinal muscular atrophy; inborn errors of metabolism, such as leukodystrophies, including Krabbe disease, metachromatic leukodystrophy, adrenoleukodystrophy, Pelizaeus-Merzbacher disease, and Canavan disease, mitochondrial encephalomyopathies, including Leigh disease and other mitochondrial encephalomyopathies; toxic and acquired metabolic diseases, including vitamin deficiencies such as thiamine (vitamin B₁) deficiency and vitamin B₁₂ deficiency, neurologic sequelae of metabolic disturbances, including hypoglycemia, hyperglycemia, and hepatic encephalopathy, toxic disorders, including carbon monoxide, methanol, ethanol, and radiation, including combined methotrexate and radiation-induced injury; tumors, such as gliomas, including astrocytoma, including fibrillary (diffuse) astrocytoma and glioblastoma multiforme, pilocytic astrocytoma, pleomorphic xanthoastrocytoma, and brain stem glioma, oligodendroglioma, and ependymoma and related paraventricular mass lesions, neuronal tumors, poorly differentiated neoplasms, including medulloblastoma, other parenchymal tumors, including primary brain lymphoma, germ cell tumors, and pineal parenchymal tumors, meningiomas, metastatic tumors, paraneoplastic syndromes, peripheral nerve sheath tumors, including schwannoma, neurofibroma, and malignant peripheral nerve sheath tumor (malignant schwannoma), and neurocutaneous syndromes (phakomatoses), including neurofibromatosis, including Type 1 neurofibromatosis (NF1) and Type 2 neurofibromatosis (NF2), tuberous sclerosis, and Von Hippel-Lindau disease.

[5501] 46508 mRNA expression was also detected in the dorsal root ganglia (DRG). Therefore, 46508-associated disorders can detrimentally affect regulation and modulation of the pain response; and vasoconstriction, inflammatory response and pain therefrom. Examples of such disorders in which the 46508 molecules of the invention may be directly or indirectly involved include pain, pain syndromes, and inflammatory disorders, including inflammatory pain. Examples of pain conditions include, but are not limited to, pain elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia; pain associated with musculoskeletal disorders, e.g., joint pain, or arthritis; tooth pain; headaches, e.g., migraine; pain associated with surgery; pain related to inflammation, e.g., irritable bowel syndrome; chest pain; or hyperalgesia, e.g., excessive sensitivity to pain (described in, for example, Fields (1987) Pain, New York: McGraw-Hill). Other examples of pain disorders or pain syndromes include, but are not limited to, complex regional pain syndrome (CRPS), reflex sympathetic dystrophy (RSD), causalgia, neuralgia, central pain and dysesthesia syndrome, carotidynia, neurogenic pain, refractory cervicobrachial pain syndrome, myofascial pain syndrome, craniomandibular pain dysfunction syndrome, chronic idiopathic pain syndrome, Costen's pain-dysfunction, acute chest pain syndrome, nonulcer dyspepsia, interstitial cystitis, gynecologic pain syndrome, patellofemoral pain syndrome, anterior knee pain syndrome, recurrent abdominal pain in children, colic, low back pain syndrome, neuropathic pain, phantom pain from amputation, phantom tooth pain, or pain asymbolia (the inability to feel pain). Other examples of pain conditions include pain induced by parturition, or post partum pain.

[5502] Moderate to high expression of 46508 mRNA was also detected in the skin. Diseases of the skin include, but are not limited to, disorders of pigmentation and melanocytes, including but not limited to, vitiligo, freckle, melasma, lentigo, nevocellular nevus, dysplastic nevi, and malignant melanoma; benign epithelial tumors, including but not limited to, seborrheic keratoses, acanthosis nigricans, fibroepithelial polyp, epithelial cyst, keratoacanthoma, and adnexal (appendage) tumors; premalignant and malignant epidermal tumors, including but not limited to, actinic keratosis, squamous cell carcinoma, basal cell carcinoma, and Merkel cell carcinoma; tumors of the dermis, including but not limited to, benign fibrous histiocytoma, dermatofibrosarcoma protuberans, xanthomas, and dermal vascular tumors; tumors of cellular immigrants to the skin, including but not limited to, histiocytosis X, mycosis fungoides (cutaneous T-cell lymphoma), and mastocytosis; disorders of epidermal maturation, including but not limited to, ichthyosis; acute inflammatory dermatoses, including but not limited to, urticaria, acute eczematous dermatitis, and erythema multiforme; chronic inflammatory dermatoses, including but not limited to, psoriasis, lichen planus, and lupus erythematosus; blistering (bullous) diseases, including but not limited to, pemphigus, bullous pemphigoid, dermatitis herpetiformis, and noninflammatory blistering diseases: epidermolysis bullosa and porphyria; disorders of epidermal appendages, including but not limited to, acne vulgaris; panniculitis, including but not limited to, erythema nodosum and erythema induratum; and infection and infestation, such as verrucae, molluscum contagiosum, impetigo, superficial fungal infections, and arthropod bites, stings, and infestations.

[5503] Moderate expression of 46508 mRNA was detected in both normal and tumor breast tissue samples, and thus 46508 may mediate disorders of the breast. Disorders of the breast include, but are not limited to, disorders of development; inflammations, including but not limited to, acute mastitis, periductal mastitis (recurrent subareolar abscess, squamous metaplasia of lactiferous ducts), mammary duct ectasia, fat necrosis, granulomatous mastitis, and pathologies associated with silicone breast implants; fibrocystic changes; proliferative breast disease including, but not limited to, epithelial hyperplasia, sclerosing adenosis, and small duct papillomas; tumors including, but not limited to, stromal tumors such as fibroadenoma, phyllodes tumor, and sarcomas, and epithelial tumors such as large duct papilloma; carcinoma of the breast including in situ (noninvasive) carcinoma that includes ductal carcinoma in situ (including Paget's disease) and lobular carcinoma in situ, and invasive (infiltrating) carcinoma including, but not limited to, invasive ductal carcinoma, no special type, invasive lobular carcinoma, medullary carcinoma, colloid (mucinous) carcinoma, tubular carcinoma, and invasive papillary carcinoma, and miscellaneous malignant neoplasms.

[5504] Disorders in the male breast include, but are not limited to, gynecomastia and carcinoma.

[5505] Normal and tumorous samples of ovarian tissue also showed expression of 46508 mRNA. Data indicate that 46508 is highly expressed in several ovarian cell lines, including SKOV3/Var, A2780, MDA 2774 and ES-2. Disorders involving the ovary include, for example, polycystic ovarian disease, Stein-Leventhal syndrome, pseudomyxoma peritonei and stromal hyperthecosis; ovarian tumors such as tumors of coelomic epithelium, serous tumors, mucinous tumors, endometrioid tumors, clear cell adenocarcinoma, cystadenofibroma, Brenner tumor, and surface epithelial tumors; germ cell tumors such as mature (benign) teratomas, monodermal teratomas, immature malignant teratomas, dysgerminoma, endodermal sinus tumor, and choriocarcinoma; sex cord-stromal tumors such as granulosa-theca cell tumors, thecoma-fibromas, androblastomas, hilus cell tumors, and gonadoblastoma; and metastatic tumors such as Krukenberg tumors.

[5506] Moderate expression of 46508 mRNA was also noted in normal colon tissue, and high expression was noted in colon tumor samples. Thus 46508 may mediate diseases involving the colon. Disorders involving the colon include, but are not limited to, congenital anomalies, such as atresia and stenosis, Meckel diverticulum, congenital aganglionic megacolon-Hirschsprung disease; enterocolitis, such as diarrhea and dysentery, infectious enterocolitis, including viral gastroenteritis, bacterial enterocolitis, necrotizing enterocolitis, antibiotic-associated colitis (pseudomembranous colitis), and collagenous and lymphocytic colitis, miscellaneous intestinal inflammatory disorders, including parasites and protozoa, acquired immunodeficiency syndrome, transplantation, drug-induced intestinal injury, radiation enterocolitis, neutropenic colitis (typhlitis), and diversion colitis; idiopathic inflammatory bowel disease, such as Crohn disease and ulcerative colitis; tumors of the colon, such as non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors.

[5507] Moderate 46508 mRNA expression was also noted in normal and tumor prostate samples. Disorders involving the prostate include, but are not limited to, inflammations, benign enlargement, for example, nodular hyperplasia (benign prostatic hypertrophy or hyperplasia), and tumors such as carcinoma. As used herein, “a prostate disorder” refers to an abnormal condition occurring in the male pelvic region characterized by, e.g., male sexual dysfunction and/or urinary symptoms. This disorder may be manifested in the form of genitourinary inflammation (e.g., inflammation of smooth muscle cells) as in several common diseases of the prostate including prostatitis, benign prostatic hyperplasia and cancer, e.g., adenocarcinoma or carcinoma, of the prostate.

[5508] Moderate expression of 46508 mRNA was observed in fibrotic liver tissue samples. Thus, 46508 may be involved in disorders of the liver. Disorders involving the liver include, but are not limited to, hepatic injury; jaundice and cholestasis, such as bilirubin and bile formation; hepatic failure and cirrhosis, such as cirrhosis, portal hypertension, including ascites, portosystemic shunts, and splenomegaly; infectious disorders, such as viral hepatitis, including hepatitis A-E infection and infection by other hepatitis viruses, clinicopathologic syndromes, such as the carrier state, asymptomatic infection, acute viral hepatitis, chronic viral hepatitis, and fulminant hepatitis; autoimmune hepatitis; drug- and toxin-induced liver disease, such as alcoholic liver disease; inborn errors of metabolism and pediatric liver disease, such as hemochromatosis, Wilson disease, α₁-antitrypsin deficiency, and neonatal hepatitis; intrahepatic biliary tract disease, such as secondary biliary cirrhosis, primary biliary cirrhosis, primary sclerosing cholangitis, and anomalies of the biliary tree; circulatory disorders, such as impaired blood flow into the liver, including hepatic artery compromise and portal vein obstruction and thrombosis, impaired blood flow through the liver, including passive congestion and centrilobular necrosis and peliosis hepatis, hepatic vein outflow obstruction, including hepatic vein thrombosis (Budd-Chiari syndrome) and veno-occlusive disease; hepatic disease associated with pregnancy, such as preeclampsia and eclampsia, acute fatty liver of pregnancy, and intrahepatic cholestasis of pregnancy; hepatic complications of organ or bone marrow transplantation, such as drug toxicity after bone marrow transplantation, graft-versus-host disease and liver rejection, and nonimmunologic damage to liver allografts; tumors and tumorous conditions, such as nodular hyperplasias, adenomas, and malignant tumors, including primary carcinoma of the liver and metastatic tumors.

[5509] Moderate 46508 expression was also noted in both normal lung and lung tumor samples, and thus 46508 may mediate disorders of the lung, e.g., lung disorders. Examples of disorders of the lung include, but are not limited to, congenital anomalies; atelectasis; diseases of vascular origin, such as pulmonary congestion and edema, including hemodynamic pulmonary edema and edema caused by microvascular injury, adult respiratory distress syndrome (diffuse alveolar damage), pulmonary embolism, hemorrhage, and infarction, and pulmonary hypertension and vascular sclerosis; chronic obstructive pulmonary disease, such as emphysema, chronic bronchitis, bronchial asthma, and bronchiectasis; diffuse interstitial (infiltrative, restrictive) diseases, such as pneumoconioses, sarcoidosis, idiopathic pulmonary fibrosis, desquamative interstitial pneumonitis, hypersensitivity pneumonitis, pulmonary eosinophilia (pulmonary infiltration with eosinophilia), bronchiolitis obliterans-organizing pneumonia, diffuse pulmonary hemorrhage syndromes, including Goodpasture syndrome, idiopathic pulmonary hemosiderosis and other hemorrhagic syndromes, pulmonary involvement in collagen vascular disorders, and pulmonary alveolar proteinosis; complications of therapies, such as drug-induced lung disease, radiation-induced lung disease, and lung transplantation; tumors, such as bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

[5510] The 46508 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO: 102 are collectively referred to as “polypeptides or proteins of the invention” or “46508 polypeptides or proteins”. Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “46508 nucleic acids.” 46508 molecules refer to 46508 nucleic acids, polypeptides, and antibodies.

[5511] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA), RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA. A DNA or RNA analog can be synthesized from nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[5512] The term “isolated nucleic acid molecule” or “purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules present in the natural source of the nucleic acid. For example, with regard to genomic DNA, the term “isolated” includes nucleic acid molecules which are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[5513] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and nonaqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified.
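By way of illustration only, the four sets of hybridization and wash conditions recited above can be tabulated in a small lookup structure. The following Python sketch is not part of the described methods; the names `Stringency`, `CONDITIONS`, `DEFAULT_LEVEL` and `conditions_for` are hypothetical, and the sketch merely records the recited values, with very high stringency returned when no level is specified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stringency:
    """One set of hybridization and wash conditions (illustrative record)."""
    hybridization: str   # hybridization buffer and temperature
    wash_buffer: str     # wash buffer composition
    wash_temp_c: int     # wash temperature in degrees Celsius
    min_washes: int      # minimum number of washes recited


# Values transcribed from the four conditions recited in the text.
CONDITIONS = {
    "low": Stringency("6x SSC at about 45 C", "0.2x SSC, 0.1% SDS", 50, 2),
    "medium": Stringency("6x SSC at about 45 C", "0.2x SSC, 0.1% SDS", 60, 1),
    "high": Stringency("6x SSC at about 45 C", "0.2x SSC, 0.1% SDS", 65, 1),
    "very_high": Stringency("0.5 M sodium phosphate, 7% SDS at 65 C",
                            "0.2x SSC, 1% SDS", 65, 1),
}

# Very high stringency is the recited default "unless otherwise specified".
DEFAULT_LEVEL = "very_high"


def conditions_for(level=None):
    """Return the conditions for a named level, or the default if none given."""
    return CONDITIONS[level or DEFAULT_LEVEL]


if __name__ == "__main__":
    print(conditions_for())
    print(conditions_for("low"))
```

The low stringency wash temperature is stored as its recited minimum of 50° C.; the text permits raising it to 55° C.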

[5514] Preferably, an isolated nucleic acid molecule of the invention that hybridizes under a stringency condition described herein to the sequence of SEQ ID NO:101 or SEQ ID NO:103, corresponds to a naturally-occurring nucleic acid molecule.

[5515] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature. For example, a naturally-occurring nucleic acid molecule can encode a natural protein. As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules which include at least an open reading frame encoding a 46508 protein. The gene can optionally further include non-coding sequences, e.g., regulatory sequences and introns. Preferably, a gene encodes a mammalian 46508 protein or derivative thereof.

[5516] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. “Substantially free” means that a preparation of 46508 protein is at least 10% pure. In a preferred embodiment, the preparation of 46508 protein has less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-46508 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-46508 chemicals. When the 46508 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[5517] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 46508 without abolishing or substantially altering a 46508 activity. Preferably the alteration does not substantially alter the 46508 activity, e.g., the activity is at least 20%, 40%, 60%, 70% or 80% of wild-type. An “essential” amino acid residue is a residue that, when altered from the wild-type sequence of 46508, results in abolishing a 46508 activity such that less than 20% of the wild-type activity is present. For example, conserved amino acid residues in 46508 are predicted to be particularly unamenable to alteration.
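As a minimal sketch of the distinction drawn above, a residue can be labeled from the activity retained after it is altered. The function name `classify_residue` and the default threshold of 20% are illustrative assumptions taken from the figure recited for an “essential” residue; the activity values would come from whatever assay is used and are not defined here.

```python
def classify_residue(mutant_activity, wild_type_activity, threshold=0.20):
    """Label a residue from the activity retained after altering it.

    Returns "essential" when less than `threshold` (a fraction) of the
    wild-type activity remains, and "non-essential" otherwise.
    Both activities must be in the same (arbitrary) units.
    """
    if wild_type_activity <= 0:
        raise ValueError("wild-type activity must be positive")
    fraction = mutant_activity / wild_type_activity
    return "non-essential" if fraction >= threshold else "essential"


if __name__ == "__main__":
    print(classify_residue(85.0, 100.0))  # most activity retained
    print(classify_residue(5.0, 100.0))   # activity largely abolished
```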

[5518] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a 46508 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a 46508 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 46508 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO: 101 or SEQ ID NO: 103, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
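The side chain families listed above lend themselves to a simple membership test. The following Python sketch, offered only as an illustration, encodes the six families using one-letter amino acid codes and reports whether a substitution stays within at least one family. The names `SIDE_CHAIN_FAMILIES`, `shared_families` and `is_conservative` are hypothetical, and treating membership in any one shared family as sufficient is an assumption, since some residues (e.g., valine, tyrosine, histidine) appear in more than one family.

```python
# Side chain families as listed in the text, in one-letter amino acid codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),               # lysine, arginine, histidine
    "acidic": set("DE"),               # aspartic acid, glutamic acid
    "uncharged_polar": set("GNQSTYC"), # Gly, Asn, Gln, Ser, Thr, Tyr, Cys
    "nonpolar": set("AVLIPFMW"),       # Ala, Val, Leu, Ile, Pro, Phe, Met, Trp
    "beta_branched": set("TVI"),       # threonine, valine, isoleucine
    "aromatic": set("YFWH"),           # Tyr, Phe, Trp, His
}


def shared_families(original, replacement):
    """Return the sorted names of families containing both residues."""
    a, b = original.upper(), replacement.upper()
    return sorted(
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if a in members and b in members
    )


def is_conservative(original, replacement):
    """True if the two residues differ and share at least one family."""
    if original.upper() == replacement.upper():
        return False  # no substitution took place
    return bool(shared_families(original, replacement))


if __name__ == "__main__":
    print(is_conservative("K", "R"))   # both basic
    print(is_conservative("D", "K"))   # acidic vs. basic
    print(shared_families("V", "I"))   # share two families
```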

[5519] As used herein, a “biologically active portion” of a 46508 protein includes a fragment of a 46508 protein which participates in an interaction, e.g., an intramolecular or an inter-molecular interaction. An inter-molecular interaction can be a specific binding interaction or an enzymatic interaction (e.g., the interaction can be transient and a covalent bond is formed or broken). An inter-molecular interaction can be between a 46508 molecule and a non-46508 molecule or between a first 46508 molecule and a second 46508 molecule (e.g., a dimerization interaction). Biologically active portions of a 46508 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 46508 protein, e.g., the amino acid sequence shown in SEQ ID NO: 102, which include fewer amino acids than the full length 46508 proteins, and exhibit at least one activity of a 46508 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 46508 protein, e.g., hydrolase activity towards peptidyl-tRNA substrates, binding activity specific for ribonucleic acids, and binding activity specific for peptides. A biologically active portion of a 46508 protein can be a polypeptide which is, for example, 10, 25, 50, 100, 200 or more amino acids in length. Biologically active portions of a 46508 protein can be used as targets for developing agents which modulate a 46508 mediated activity, e.g., peptidyl-tRNA hydrolase activity, RNA binding activity, and peptide binding activity.

[5520] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[5521] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, 60%, and even more preferably at least 70%, 80%, 90%, 100% of the length of the reference sequence. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”).

[5522] The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
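By way of non-limiting illustration, the relationship between identical positions, gaps, and percent identity can be sketched as follows. The sketch assumes one common convention, namely that every alignment column (including gap columns) counts toward the denominator; the function name `percent_identity` and the gap symbol "-" are hypothetical choices made for this example and are not part of any program referenced herein.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over all alignment columns.

    Assumption for this sketch: a column containing a gap counts as a
    non-identical position, so longer or more numerous gaps lower the result.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b)
        if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


# Toy alignment (not a 46508 sequence): 4 identical columns out of 6.
print(round(percent_identity("ACGT-A", "ACCTTA"), 2))
```

Other conventions (for example, dividing by the length of the shorter sequence) give different values, so the convention used should be stated alongside any reported figure.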

[5523] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch ((1970) J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used unless otherwise specified) is a BLOSUM 62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
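By way of non-limiting illustration, a simplified global alignment in the manner of Needleman and Wunsch can be sketched as follows. This is not the GAP program: the sketch assumes a flat match/mismatch score in place of a BLOSUM 62 or PAM250 matrix and a single linear gap penalty in place of separate gap and gap-extend penalties. The function name and default scores are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Global alignment with a linear gap penalty (simplified sketch).

    Returns (aligned_a, aligned_b, score). Scores are illustrative
    assumptions, not the parameters of any referenced program.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


# Toy sequences (not 46508 sequences).
print(needleman_wunsch("ACGT", "AGT"))
```

An implementation following the preferred parameters above would substitute a residue scoring matrix for the flat match/mismatch score and use an affine gap model.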

[5524] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Meyers and W. Miller ((1989) CABIOS, 4:11-17) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[5525] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 46508 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 46508 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25:3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[5526] Particularly preferred 46508 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO: 102. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO: 102 are termed substantially identical.

[5527] In the context of a nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:101 or 103 are termed substantially identical.

[5528] “Misexpression or aberrant expression”, as used herein, refers to a non-wildtype pattern of gene expression at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over- or under-expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of altered, e.g., increased or decreased, expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, translated amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[5529] “Subject,” as used herein, refers to human and non-human animals. The term “non-human animals” of the invention includes all vertebrates, e.g., mammals, such as non-human primates (particularly higher primates), sheep, dog, rodent (e.g., mouse or rat), guinea pig, goat, pig, cat, rabbits, cow, and non-mammals, such as chickens, amphibians, reptiles, etc. In a preferred embodiment, the subject is a human. In another embodiment, the subject is an experimental animal or animal suitable as a disease model.

[5530] A “purified preparation of cells”, as used herein, refers to an in vitro preparation of cells. In the case of cells from multicellular organisms (e.g., plants and animals), a purified preparation of cells is a subset of cells obtained from the organism, not the entire intact organism. In the case of unicellular microorganisms (e.g., cultured cells and microbial cells), it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[5531] Various aspects of the invention are described in further detail below.

[5532] Isolated Nucleic Acid Molecules of 46508

[5533] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes a 46508 polypeptide described herein, e.g., a full-length 46508 protein or a fragment thereof, e.g., a biologically active portion of a 46508 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 46508 mRNA, and fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[5534] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO:101, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 46508 protein (i.e., “the coding region” of SEQ ID NO:101, as shown in SEQ ID NO: 103), as well as 5′untranslated sequences. Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:101 (e.g., SEQ ID NO:103) and, e.g., no flanking sequences which normally accompany the subject sequence. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to a fragment of the protein from about amino acid 44 to 221 of SEQ ID NO:102.

[5535] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule which is a complement of the nucleotide sequence shown in SEQ ID NO:101 or SEQ ID NO:103, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO:101 or SEQ ID NO:103, such that it can hybridize (e.g., under a stringency condition described herein) to the nucleotide sequence shown in SEQ ID NO:101 or 103, thereby forming a stable duplex.

[5536] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence which is at least about: 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO:101 or SEQ ID NO:103, or a portion, preferably of the same length, of any of these nucleotide sequences.

[5537] 46508 Nucleic Acid Fragments

[5538] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO: 101 or 103. For example, such a nucleic acid molecule can include a fragment which can be used as a probe or primer or a fragment encoding a portion of a 46508 protein, e.g., an immunogenic or biologically active portion of a 46508 protein. A fragment can comprise those nucleotides of SEQ ID NO:101 which encode amino acids from about 44 to about 221 of SEQ ID NO: 102, which encompass a peptidyl-tRNA hydrolase domain of human 46508. The nucleotide sequence determined from the cloning of the 46508 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 46508 family members, or fragments thereof, as well as 46508 homologues, or fragments thereof, from other species.
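By way of non-limiting illustration, the nucleotide positions within a coding region that encode a given range of amino acids follow from simple codon arithmetic. The sketch below assumes that nucleotide 1 of the coding region is the first base of the codon for amino acid 1 and that positions are 1-based and inclusive; the function name `codon_span` is hypothetical.

```python
from typing import Tuple


def codon_span(first_aa: int, last_aa: int) -> Tuple[int, int]:
    """Return the 1-based, inclusive nucleotide range of a coding region
    that encodes amino acids first_aa..last_aa.

    Assumption: coding-region nucleotide 1 is the first base of codon 1.
    """
    if first_aa < 1 or last_aa < first_aa:
        raise ValueError("amino acid range must satisfy 1 <= first_aa <= last_aa")
    start_nt = 3 * (first_aa - 1) + 1
    end_nt = 3 * last_aa
    return start_nt, end_nt


# Amino acids 44 to 221 span 178 codons, i.e. 534 nucleotides.
print(codon_span(44, 221))  # (130, 663)
```

Under these assumptions, a fragment encoding amino acids 44 to 221 corresponds to nucleotides 130 to 663 of the coding region; positions within a full-length cDNA would additionally be offset by the length of the 5′ untranslated sequence.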

[5539] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment which includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 20 amino acids in length. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention.

[5540] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domains, regions, or functional sites described herein. Thus, for example, a 46508 nucleic acid fragment can include a sequence corresponding to a peptidyl-tRNA hydrolase domain at locations in the translated 46508 polypeptide described herein.

[5541] 46508 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under a stringency condition described herein to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO: 101 or SEQ ID NO:103, or of a naturally occurring allelic variant or mutant of SEQ ID NO:101 or SEQ ID NO:103.

[5542] In a preferred embodiment the nucleic acid is a probe which is at least 5 or 10, and less than 200, more preferably less than 100, or less than 50, base pairs in length. It should be identical to a sequence disclosed herein, or differ from it by 1 base, or by fewer than 5 or 10 bases. If alignment is needed for this comparison, the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[5543] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid which encodes, for example, a peptidyl-tRNA hydrolase domain from about amino acid 44 to about 221 of SEQ ID NO:102.

[5544] In another embodiment a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 46508 sequence, e.g., a domain, region, site or other sequence described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical to, or differ by one base from, a sequence disclosed herein or a naturally occurring variant. For example, primers suitable for amplifying all or a portion of any of the following regions are provided: a peptidyl-tRNA hydrolase domain from about amino acid 44 to 221 of SEQ ID NO:102.

[5545] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[5546] A nucleic acid fragment encoding a “biologically active portion of a 46508 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:101 or 103, which encodes a polypeptide having a 46508 biological activity (e.g., the biological activities of the 46508 proteins are described herein), expressing the encoded portion of the 46508 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 46508 protein. For example, a nucleic acid fragment encoding a biologically active portion of 46508 includes a peptidyl-tRNA hydrolase domain, e.g., amino acid residues about 44 to about 221 of SEQ ID NO: 102. A nucleic acid fragment encoding a biologically active portion of a 46508 polypeptide may comprise a nucleotide sequence which is 300 or more nucleotides in length.

[5547] In preferred embodiments, the nucleic acid fragment includes a nucleotide sequence that is other than, e.g., differs by at least one, two, three or more nucleotides from, the sequence of AW665597 or AA479713.

[5548] In preferred embodiments, the fragment comprises the coding region of 46508, e.g., the nucleotide sequence of SEQ ID NO: 103.

[5549] In preferred embodiments, a nucleic acid includes a nucleotide sequence which is about 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1180, or 1182 nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO:101, or SEQ ID NO:103.

[5550] 46508 Nucleic Acid Variants

[5551] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:101 or SEQ ID NO:103. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid which encodes the same 46508 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence which differs by at least 1, but fewer than 5, 10, 20, 50, or 100, amino acid residues from that shown in SEQ ID NO:102. If alignment is needed for this comparison, the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[5552] Nucleic acids of the invention can be chosen for having codons which are preferred, or non-preferred, for a particular expression system. E.g., the nucleic acid can be one in which at least one codon, and preferably at least 10%, or 20% of the codons, has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells.
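By way of non-limiting illustration, the fraction of codons altered between an original coding sequence and a codon-optimized variant encoding the same protein can be computed as sketched below. The sketch assumes the standard genetic code and in-frame coding sequences of equal length; the function names are hypothetical and the sequences in the example are toy sequences, not 46508 sequences.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence (DNA or RNA letters)."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def fraction_codons_altered(original: str, optimized: str) -> float:
    """Fraction of codons that differ between two sequences encoding
    the same protein (synonymous changes only)."""
    if translate(original) != translate(optimized):
        raise ValueError("sequences do not encode the same protein")
    a = original.upper().replace("U", "T")
    b = optimized.upper().replace("U", "T")
    n_codons = len(a) // 3
    if n_codons == 0:
        raise ValueError("coding sequence is empty")
    changed = sum(1 for i in range(0, len(a), 3) if a[i:i + 3] != b[i:i + 3])
    return changed / n_codons


# One of three codons changed (CTT -> CTG, both leucine).
print(fraction_codons_altered("ATGCTTAAA", "ATGCTGAAA"))
```

Which synonymous codon is "preferred" depends on the codon usage of the chosen expression host, which this sketch does not model.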

[5553] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism) or can be non naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared in the encoded product).

[5554] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO:101 or 103, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 1%, 5%, 10% or 20% of the nucleotides in the subject nucleic acid. If necessary for this analysis the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[5555] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the nucleotide sequence shown in SEQ ID NO: 102 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under a stringency condition described herein, to the nucleotide sequence shown in SEQ ID NO: 102 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 46508 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 46508 gene.

[5556] Preferred variants include those that are correlated with modulating (stimulating and/or enhancing or inhibiting) cellular proliferation, differentiation, or tumorigenesis; modulating metabolism; modulating the course of viral and/or bacterial infection; and modulating cellular responses to environmental stress and toxins.

[5557] Allelic variants of 46508, e.g., human 46508, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 46508 protein within a population that maintain the ability to bind tRNAs, short peptides, peptidyl-tRNAs, and to catalyze peptidyl-tRNA hydrolysis. Functional allelic variants will typically contain only conservative substitution of one or more amino acids of SEQ ID NO:102, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally-occurring amino acid sequence variants of the 46508, e.g., human 46508, protein within a population that do not have the ability to bind tRNAs, short peptides, peptidyl-tRNAs, and to catalyze peptidyl-tRNA hydrolysis. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO: 102, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[5558] Moreover, nucleic acid molecules encoding other 46508 family members and, thus, which have a nucleotide sequence which differs from the 46508 sequences of SEQ ID NO:101 or SEQ ID NO:103 are intended to be within the scope of the invention.

[5559] Antisense Nucleic Acid Molecules, Ribozymes and Modified 46508 Nucleic Acid Molecules

[5560] In another aspect, the invention features an isolated nucleic acid molecule which is antisense to 46508. An “antisense” nucleic acid can include a nucleotide sequence which is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 46508 coding strand, or to only a portion thereof (e.g., the coding region of human 46508 corresponding to SEQ ID NO:103). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 46508 (e.g., the 5′ and 3′ untranslated regions).

[5561] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 46508 mRNA, but more preferably is an oligonucleotide which is antisense to only a portion of the coding or noncoding region of 46508 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 46508 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence of interest. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
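By way of non-limiting illustration, an antisense oligonucleotide complementary to the region surrounding a translation start site can be derived as sketched below. The sketch assumes a numbering convention in which +1 is the first base of the start codon and −1 is the base immediately 5′ of it, so that the −10 to +10 region is 20 nucleotides long. The function names are hypothetical, the output is written as DNA, and the example transcript is a toy sequence, not a 46508 sequence.

```python
# Complement map covering DNA and RNA letters; output is DNA.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA or RNA sequence, returned as DNA."""
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_to_start_site(transcript: str, atg_index: int,
                            upstream: int = 10, downstream: int = 10) -> str:
    """Antisense oligonucleotide covering `upstream` bases 5' of the start
    codon and the first `downstream` bases beginning at the start codon.

    atg_index is the 0-based index of the first base of the start codon
    (an assumption of this sketch).
    """
    start_codon = transcript[atg_index:atg_index + 3].upper().replace("U", "T")
    if start_codon != "ATG":
        raise ValueError("atg_index does not point at a start codon")
    lo = atg_index - upstream
    hi = atg_index + downstream
    if lo < 0 or hi > len(transcript):
        raise ValueError("requested window extends beyond the transcript")
    return reverse_complement(transcript[lo:hi])


toy_transcript = "TT" + "CCCCCGGGGG" + "ATGAAACCCTTT"
print(antisense_to_start_site(toy_transcript, atg_index=12))
```

The window sizes are parameters so that oligonucleotides of other lengths recited above can be generated in the same way.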

[5562] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[5563] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a 46508 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies which bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

[5564] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[5565] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for a 46508-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of a 46508 cDNA disclosed herein (i.e., SEQ ID NO:101 or SEQ ID NO:103), and a sequence having a known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haseloff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a 46508-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 46508 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science 261:1411-1418.

[5566] 46508 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 46508 gene (e.g., the 46508 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 46508 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6:569-84; Helene, C. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) Bioessays 14:807-15. The potential sequences that can be targeted for triple helix formation can be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.

[5567] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[5568] A 46508 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For non-limiting examples of synthetic oligonucleotides with modifications see Toulmé (2001) Nature Biotech. 19:17 and Faria et al. (2001) Nature Biotech. 19:40-44. Such phosphoramidite oligonucleotides can be effective antisense agents.

[5569] For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4: 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra and Perry-O'Keefe et al. Proc. Natl. Acad. Sci. 93: 14670-675.

[5570] PNAs of 46508 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 46508 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene, (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes, (e.g., S1 nucleases (Hyrup B. et al. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[5571] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) Bio-Techniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[5572] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region which is complementary to a 46508 nucleic acid of the invention, two complementary regions one having a fluorophore and one a quencher such that the molecular beacon is useful for quantitating the presence of the 46508 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.

[5573] Isolated 46508 Polypeptides

[5574] In another aspect, the invention features an isolated 46508 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-46508 antibodies. 46508 protein can be isolated from cells or tissue sources using standard protein purification techniques. 46508 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[5575] Polypeptides of the invention include those which arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when expressed in a native cell.

[5576] In a preferred embodiment, a 46508 polypeptide has one or more of the following characteristics:

[5577] (i) it has the ability to bind tRNA, and peptidyl-tRNAs;

[5578] (ii) it catalyzes the hydrolysis of peptidyl-tRNAs;

[5579] (iii) it has a molecular weight, e.g., a deduced molecular weight, preferably ignoring any contribution of post translational modifications, amino acid composition or other physical characteristic of SEQ ID NO: 102;

[5580] (iv) it has an overall sequence similarity of at least 60%, more preferably at least 70, 80, 90, or 95%, with a polypeptide of SEQ ID NO: 102;

[5581] (v) it has a peptidyl-tRNA hydrolase domain with a sequence homology which is preferably about 70%, 80%, 90% or 95% with amino acid residues about 44 to about 221 of SEQ ID NO:102;

[5582] (vii) it has at least one, preferably two, and most preferably three of the conserved residues found at positions 51, 109, and 155 of SEQ ID NO: 102; or

[5583] (viii) it has at least one, preferably two, and most preferably three of the conserved residues, histidine, aspartic acid, and arginine, respectively, located at positions 59, 134, and 173 of SEQ ID NO: 102.

[5584] In a preferred embodiment the 46508 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:102. In one embodiment it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another it differs from the corresponding sequence in SEQ ID NO:102 by at least one residue but less than 20%, 15%, 10% or 5% of the residues in it differ from the corresponding sequence in SEQ ID NO: 102. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non essential residue or a conservative substitution. In a preferred embodiment the differences are not in the peptidyl-tRNA hydrolase domain. In another preferred embodiment one or more differences are in the peptidyl-tRNA hydrolase domain.

[5585] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 46508 proteins differ in amino acid sequence from SEQ ID NO: 102, yet retain biological activity.

[5586] In one embodiment, the protein includes an amino acid sequence at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more homologous to SEQ ID NO:102.

[5587] A 46508 protein or fragment is provided which varies from the sequence of SEQ ID NO:102 in regions defined by amino acids about 1 to about 43, and from about 85 to about 97, by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment but which does not differ from SEQ ID NO: 102 in regions defined by amino acids about 44 to about 84, and from about 98 to about 217. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.
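
As a hedged illustration, the region rule above can be checked mechanically once a variant has been aligned to the reference. The sketch below assumes equal-length, ungapped sequences, treats the "about" boundaries as exact, and leaves positions outside all four listed regions unconstrained; these simplifications, the function names, and the poly-alanine placeholder are assumptions for the example only.

```python
# 1-based, inclusive residue ranges taken from the paragraph above.
VARIABLE_REGIONS = [(1, 43), (85, 97)]     # differences are permitted here
INVARIANT_REGIONS = [(44, 84), (98, 217)]  # no differences are permitted here


def difference_positions(ref: str, var: str) -> list:
    """Return the 1-based positions at which two sequences differ."""
    if len(ref) != len(var):
        raise ValueError("sequences must have equal length")
    return [i for i, (a, b) in enumerate(zip(ref, var), start=1) if a != b]


def in_any(pos: int, regions: list) -> bool:
    """True if pos falls inside at least one (start, end) range."""
    return any(start <= pos <= end for start, end in regions)


def satisfies_region_rule(ref: str, var: str, max_differences: int = 15) -> bool:
    """At least one but fewer than max_differences changes in the variable
    regions, and no change in the invariant regions."""
    diffs = difference_positions(ref, var)
    variable = [p for p in diffs if in_any(p, VARIABLE_REGIONS)]
    invariant = [p for p in diffs if in_any(p, INVARIANT_REGIONS)]
    return 1 <= len(variable) < max_differences and not invariant


# Placeholder reference (hypothetical; not SEQ ID NO:102).
placeholder_ref = "A" * 220
variant_ok = placeholder_ref[:9] + "G" + placeholder_ref[10:]    # change at position 10
variant_bad = placeholder_ref[:49] + "G" + placeholder_ref[50:]  # change at position 50
print(satisfies_region_rule(placeholder_ref, variant_ok))
print(satisfies_region_rule(placeholder_ref, variant_bad))
```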

[5588] In one embodiment, a biologically active portion of a 46508 protein includes a peptidyl-tRNA hydrolase domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 46508 protein.

[5589] In a preferred embodiment, the 46508 protein has an amino acid sequence shown in SEQ ID NO:102. In other embodiments, the 46508 protein is substantially identical to SEQ ID NO:102. In yet another embodiment, the 46508 protein is substantially identical to SEQ ID NO: 102 and retains the functional activity of the protein of SEQ ID NO: 102, as described in detail in the subsections above.

[5590] 46508 Chimeric or Fusion Proteins

[5591] In another aspect, the invention provides 46508 chimeric or fusion proteins. As used herein, a 46508 “chimeric protein” or “fusion protein” includes a 46508 polypeptide linked to a non-46508 polypeptide. A “non-46508 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the 46508 protein, e.g., a protein which is different from the 46508 protein and which is derived from the same or a different organism. The 46508 polypeptide of the fusion protein can correspond to all or a portion (e.g., a fragment described herein) of a 46508 amino acid sequence. In a preferred embodiment, a 46508 fusion protein includes at least one (or two) biologically active portions of a 46508 protein. The non-46508 polypeptide can be fused to the N-terminus or C-terminus of the 46508 polypeptide.

[5592] The fusion protein can include a moiety which has a high affinity for a ligand. For example, the fusion protein can be a GST-46508 fusion protein in which the 46508 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 46508. Alternatively, the fusion protein can be a 46508 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 46508 can be increased through use of a heterologous signal sequence.

[5593] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[5594] The 46508 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 46508 fusion proteins can be used to affect the bioavailability of a 46508 substrate. 46508 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example, (i) aberrant modification or mutation of a gene encoding a 46508 protein; (ii) mis-regulation of the 46508 gene; and (iii) aberrant post-translational modification of a 46508 protein.

[5595] Moreover, the 46508-fusion proteins of the invention can be used as immunogens to produce anti-46508 antibodies in a subject, to purify 46508 ligands and in screening assays to identify molecules which inhibit the interaction of 46508 with a 46508 substrate.

[5596] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 46508-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 46508 protein.

[5597] Variants of 46508 Proteins

[5598] In another aspect, the invention also features a variant of a 46508 polypeptide, e.g., which functions as an agonist (mimetics) or as an antagonist. Variants of the 46508 proteins can be generated by mutagenesis, e.g., discrete point mutation, the insertion or deletion of sequences or the truncation of a 46508 protein. An agonist of the 46508 proteins can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of a 46508 protein. An antagonist of a 46508 protein can inhibit one or more of the activities of the naturally occurring form of the 46508 protein by, for example, competitively modulating a 46508-mediated activity of a 46508 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 46508 protein.

[5599] Variants of a 46508 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a 46508 protein for agonist or antagonist activity.

[5600] Libraries of fragments, e.g., N terminal, C terminal, or internal fragments, of a 46508 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of a 46508 protein. Variants in which a cysteine residue is added or deleted or in which a residue which is glycosylated is added or deleted are particularly preferred.
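
Purely as an in silico illustration of the three fragment classes named above, the following sketch enumerates N terminal, C terminal, and internal fragments of an amino acid sequence. It is not a description of how a physical library is constructed; the function name, the minimum-length parameter, and the toy sequence are assumptions for the example.

```python
def enumerate_fragments(seq: str, min_len: int) -> dict:
    """Enumerate proper fragments of seq that are at least min_len long.

    N terminal fragments keep the first residue, C terminal fragments
    keep the last residue, and internal fragments exclude both ends.
    """
    n = len(seq)
    n_terminal = [seq[:k] for k in range(min_len, n)]
    c_terminal = [seq[n - k:] for k in range(min_len, n)]
    internal = [
        seq[i:j]
        for i in range(1, n - 1)          # skip the first residue
        for j in range(i + min_len, n)    # stop before the last residue
    ]
    return {"N_terminal": n_terminal, "C_terminal": c_terminal, "internal": internal}


# Toy sequence (hypothetical; not SEQ ID NO:102).
fragments = enumerate_fragments("ABCDE", min_len=2)
for kind, members in fragments.items():
    print(kind, members)
```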

[5601] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property are known in the art. Such methods are adaptable for rapid screening of the gene libraries generated by combinatorial mutagenesis of 46508 proteins. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify 46508 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6:327-331).

[5602] Cell based assays can be exploited to analyze a variegated 46508 library. For example, a library of expression vectors can be transfected into a cell line, e.g., a cell line which ordinarily responds to 46508 in a substrate-dependent manner. The transfected cells are then contacted with 46508 and the effect of the expression of the mutant on signaling by the 46508 substrate can be detected. Plasmid DNA can then be recovered from the cells which score for inhibition, or alternatively, potentiation of signaling by the 46508 substrate, and the individual clones further characterized.

[5603] In another aspect, the invention features a method of making a 46508 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 46508 polypeptide. The method includes: altering the sequence of a 46508 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[5604] In another aspect, the invention features a method of making a fragment or analog of a 46508 polypeptide having a biological activity of a naturally occurring 46508 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of a 46508 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[5605] Anti-46508 Antibodies

[5606] In another aspect, the invention provides an anti-46508 antibody, or a fragment thereof (e.g., an antigen-binding fragment thereof). The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. As used herein, the term “antibody” refers to a protein comprising at least one, and preferably two, heavy (H) chain variable regions (abbreviated herein as VH), and at least one and preferably two light (L) chain variable regions (abbreviated herein as VL). The VH and VL regions can be further subdivided into regions of hypervariability, termed “complementarity determining regions” (“CDR”), interspersed with regions that are more conserved, termed “framework regions” (FR). The extent of the framework region and CDR's has been precisely defined (see, Kabat, E. A., et al. (1991) Sequences of Proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, and Chothia, C. et al. (1987) J. Mol. Biol. 196:901-917, which are incorporated herein by reference). Each VH and VL is composed of three CDR's and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.

[5607] The anti-46508 antibody can further include a heavy and light chain constant region, to thereby form a heavy and light immunoglobulin chain, respectively. In one embodiment, the antibody is a tetramer of two heavy immunoglobulin chains and two light immunoglobulin chains, wherein the heavy and light immunoglobulin chains are inter-connected by, e.g., disulfide bonds. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. The light chain constant region is comprised of one domain, CL. The variable region of the heavy and light chains contains a binding domain that interacts with an antigen. The constant regions of the antibodies typically mediate the binding of the antibody to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system.

[5608] As used herein, the term “immunoglobulin” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes. The recognized human immunoglobulin genes include the kappa, lambda, alpha (IgA1 and IgA2), gamma (IgG1, IgG2, IgG3, IgG4), delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Full-length immunoglobulin “light chains” (about 25 kDa or 214 amino acids) are encoded by a variable region gene at the NH2-terminus (about 110 amino acids) and a kappa or lambda constant region gene at the

[5609] COOH-terminus. Full-length immunoglobulin “heavy chains” (about 50 kDa or 446 amino acids) are similarly encoded by a variable region gene (about 116 amino acids) and one of the other aforementioned constant region genes, e.g., gamma (encoding about 330 amino acids).

[5610] The term “antigen-binding fragment” of an antibody (or simply “antibody portion,” or “fragment”), as used herein, refers to one or more fragments of a full-length antibody that retain the ability to specifically bind to the antigen, e.g., 46508 polypeptide or fragment thereof. Examples of antigen-binding fragments of the anti-46508 antibody include, but are not limited to: (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab′)₂ fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 341:544-546), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) Science 242:423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879-5883). Such single chain antibodies are also encompassed within the term “antigen-binding fragment” of an antibody. These antibody fragments are obtained using conventional techniques known to those with skill in the art, and the fragments are screened for utility in the same manner as are intact antibodies.

[5611] The anti-46508 antibody can be a polyclonal or a monoclonal antibody. In other embodiments, the antibody can be recombinantly produced, e.g., produced by phage display or by combinatorial methods.

[5612] Phage display and combinatorial methods for generating anti-46508 antibodies are known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. International Publication No. WO 92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. International Publication No. WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and Barbas et al. (1991) PNAS 88:7978-7982, the contents of all of which are incorporated by reference herein).

[5613] In one embodiment, the anti-46508 antibody is a fully human antibody (e.g., an antibody made in a mouse which has been genetically engineered to produce an antibody from a human immunoglobulin sequence), or a non-human antibody, e.g., a rodent (mouse or rat), goat, primate (e.g., monkey), or camel antibody. Preferably, the non-human antibody is a rodent (mouse or rat) antibody. Methods of producing rodent antibodies are known in the art.

[5614] Human monoclonal antibodies can be generated using transgenic mice carrying the human immunoglobulin genes rather than the mouse system. Splenocytes from these transgenic mice immunized with the antigen of interest are used to produce hybridomas that secrete human mAbs with specific affinities for epitopes from a human protein (see, e.g., Wood et al. International Application WO 91/00906, Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International Application WO 92/03918; Kay et al. International Application WO 92/03917; Lonberg, N. et al. 1994 Nature 368:856-859; Green, L. L. et al. 1994 Nature Genet. 7:13-21; Morrison, S. L. et al. 1984 Proc. Natl. Acad. Sci. USA 81:6851-6855; Bruggemann et al. 1993 Year Immunol 7:33-40; Tuaillon et al. 1993 PNAS 90:3720-3724; Bruggemann et al. 1991 Eur J Immunol 21:1323-1326).

[5615] An anti-46508 antibody can be one in which the variable region, or a portion thereof, e.g., the CDR's, are generated in a non-human organism, e.g., a rat or mouse. Chimeric, CDR-grafted, and humanized antibodies are within the invention. Antibodies generated in a non-human organism, e.g., a rat or mouse, and then modified, e.g., in the variable framework or constant region, to decrease antigenicity in a human are within the invention.

[5616] Chimeric antibodies can be produced by recombinant DNA techniques known in the art. For example, a gene encoding the Fc constant region of a murine (or other species) monoclonal antibody molecule is digested with restriction enzymes to remove the region encoding the murine Fc, and the equivalent portion of a gene encoding a human Fc constant region is substituted (see Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., International Application WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al., European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al. (1987) PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al. (1985) Nature 314:446-449; and Shaw et al., 1988, J. Natl Cancer Inst. 80:1553-1559).

[5617] A humanized or CDR-grafted antibody will have at least one or two but generally all three recipient CDR's (of heavy and/or light immunoglobulin chains) replaced with a donor CDR. The antibody may be replaced with at least a portion of a non-human CDR or only some of the CDR's may be replaced with non-human CDR's. It is only necessary to replace the number of CDR's required for binding of the humanized antibody to a 46508 or a fragment thereof. Preferably, the donor will be a rodent antibody, e.g., a rat or mouse antibody, and the recipient will be a human framework or a human consensus framework. Typically, the immunoglobulin providing the CDR's is called the “donor” and the immunoglobulin providing the framework is called the “acceptor.” In one embodiment, the donor immunoglobulin is a non-human (e.g., rodent). The acceptor framework is a naturally-occurring (e.g., a human) framework or a consensus framework, or a sequence about 85% or higher, preferably 90%, 95%, 99% or higher identical thereto.

[5618] As used herein, the term “consensus sequence” refers to the sequence formed from the most frequently occurring amino acids (or nucleotides) in a family of related sequences (See e.g., Winnaker, From Genes to Clones (Verlagsgesellschaft, Weinheim, Germany 1987)). In a family of proteins, each position in the consensus sequence is occupied by the amino acid occurring most frequently at that position in the family. If two amino acids occur equally frequently, either can be included in the consensus sequence. A “consensus framework” refers to the framework region in the consensus immunoglobulin sequence.
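
The definition above amounts to a per-position majority vote, which can be sketched as follows. The sketch assumes the family members have already been aligned to a common length; the function name and the toy family are hypothetical. Because the definition allows either residue where two are equally frequent, the tie-break used here (alphabetically first) is an arbitrary choice made only so that the output is deterministic.

```python
from collections import Counter


def consensus_sequence(aligned_family: list) -> str:
    """Build a consensus from equal-length, pre-aligned sequences.

    Each position takes the residue that occurs most frequently at that
    position in the family. Ties are broken alphabetically, which is an
    arbitrary but deterministic choice.
    """
    if not aligned_family:
        raise ValueError("family must contain at least one sequence")
    length = len(aligned_family[0])
    if any(len(s) != length for s in aligned_family):
        raise ValueError("all sequences must have the same aligned length")

    consensus = []
    for column in zip(*aligned_family):
        counts = Counter(column)
        top = max(counts.values())
        consensus.append(min(res for res, c in counts.items() if c == top))
    return "".join(consensus)


# Toy family of aligned sequences (hypothetical residues).
toy_family = ["MKTAY", "MKSAY", "MRTAY", "MKTGF"]
toy_consensus = consensus_sequence(toy_family)
print(toy_consensus)
```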

[5619] An antibody can be humanized by methods known in the art. Humanized antibodies can be generated by replacing sequences of the Fv variable region which are not directly involved in antigen binding with equivalent sequences from human Fv variable regions. General methods for generating humanized antibodies are provided by Morrison, S. L., 1985, Science 229:1202-1207, by Oi et al., 1986, BioTechniques 4:214, and by Queen et al. U.S. Pat. No. 5,585,089, U.S. Pat. No. 5,693,761 and U.S. Pat. No. 5,693,762, the contents of all of which are hereby incorporated by reference. Those methods include isolating, manipulating, and expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable regions from at least one of a heavy or light chain. Sources of such nucleic acid are well known to those skilled in the art and, for example, may be obtained from a hybridoma producing an antibody against a 46508 polypeptide or fragment thereof. The recombinant DNA encoding the humanized antibody, or fragment thereof, can then be cloned into an appropriate expression vector.

[5620] Humanized or CDR-grafted antibodies can be produced by CDR-grafting or CDR substitution, wherein one, two, or all CDR's of an immunoglobulin chain can be replaced. See e.g., U.S. Pat. No. 5,225,539; Jones et al. 1986 Nature 321:522-525; Verhoeyen et al. 1988 Science 239:1534; Beidler et al. 1988 J. Immunol. 141:4053-4060; Winter U.S. Pat. No. 5,225,539, the contents of all of which are hereby expressly incorporated by reference. Winter describes a CDR-grafting method which may be used to prepare the humanized antibodies of the present invention (UK Patent Application GB 2188638A, filed on Mar. 26, 1987; Winter U.S. Pat. No. 5,225,539), the contents of which are expressly incorporated by reference.

[5621] Also within the scope of the invention are humanized antibodies in which specific amino acids have been substituted, deleted or added. Preferred humanized antibodies have amino acid substitutions in the framework region, such as to improve binding to the antigen. For example, a humanized antibody will have framework residues identical to the donor framework residue or to another amino acid other than the recipient framework residue. To generate such antibodies, a selected, small number of acceptor framework residues of the humanized immunoglobulin chain can be replaced by the corresponding donor amino acids. Preferred locations of the substitutions include amino acid residues adjacent to the CDR, or which are capable of interacting with a CDR (see e.g., U.S. Pat. No. 5,585,089). Criteria for selecting amino acids from the donor are described in U.S. Pat. No. 5,585,089, e.g., columns 12-16 of U.S. Pat. No. 5,585,089, the contents of which are hereby incorporated by reference. Other techniques for humanizing antibodies are described in Padlan et al. EP 519596 A1, published on Dec. 23, 1992.

[5622] In preferred embodiments an antibody can be made by immunizing with purified 46508 antigen, or a fragment thereof, e.g., a fragment described herein.

[5623] A full-length 46508 protein or an antigenic peptide fragment of 46508 can be used as an immunogen or can be used to identify anti-46508 antibodies made with other immunogens, e.g., cells, membrane preparations, and the like. The antigenic peptide of 46508 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO: 102 and encompasses an epitope of 46508. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.

[5624] Antibodies can be made against the peptidyl-tRNA hydrolase domain of 46508. Fragments of 46508 which include residues from about 77 to about 85, and from about 217 to about 224 of SEQ ID NO:102 can be used to make antibodies against hydrophilic regions of the 46508 protein, e.g., used as immunogens or used to characterize the specificity of an antibody. Similarly, a fragment of 46508 which includes residues from about 60 to 70, from about 86 to 102, and from about 189 to 195 of SEQ ID NO:102 can be used to make an antibody against a hydrophobic region of the 46508 protein. Similarly, a fragment of 46508 which includes residues about 44 to 221 of SEQ ID NO:102 (or a fragment thereof, e.g., 44-100, 100-150, 150-200, 200-221 of SEQ ID NO:102) can be used to make an antibody against the peptidyl-tRNA hydrolase region of the 46508 protein.
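
For illustration only, the residue ranges recited above can be turned into peptide fragments by 1-based, inclusive slicing. In the sketch below the "about" boundaries are treated as exact, and the sequence is a repeating placeholder of arbitrary length rather than SEQ ID NO:102; the dictionary name, the function name, and the placeholder are assumptions for the example.

```python
# 1-based, inclusive residue ranges recited in the paragraph above.
REGIONS = {
    "hydrophilic": [(77, 85), (217, 224)],
    "hydrophobic": [(60, 70), (86, 102), (189, 195)],
    "peptidyl_tRNA_hydrolase_domain": [(44, 221)],
}


def extract_region(seq: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of seq."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError(f"range {start}-{end} lies outside the sequence")
    return seq[start - 1:end]


# Placeholder sequence (hypothetical; substitute the real sequence).
placeholder_seq = ("ACDEFGHIKLMNPQRSTVWY" * 12)[:230]

region_fragments = {
    name: [extract_region(placeholder_seq, s, e) for s, e in ranges]
    for name, ranges in REGIONS.items()
}
for name, peptides in region_fragments.items():
    print(name, [len(p) for p in peptides])
```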

[5625] Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[5626] Antibodies which bind only native 46508 protein, only denatured or otherwise non-native 46508 protein, or which bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be identified by identifying antibodies which bind to native but not denatured 46508 protein.

[5627] Preferred epitopes encompassed by the antigenic peptide are regions of 46508 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 46508 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 46508 protein and are thus likely to constitute surface residues useful for targeting antibody production.

[5628] The anti-46508 antibody can be a single chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher, D. et al. (1999) Ann N Y Acad Sci 880:263-80; and Reiter, Y. (1996) Clin Cancer Res 2:245-52). The single chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 46508 protein.

[5629] In a preferred embodiment the antibody has effector function and can fix complement. In other embodiments the antibody does not recruit effector cells or fix complement.

[5630] In a preferred embodiment, the antibody has reduced or no ability to bind an Fc receptor. For example, it is an isotype or subtype, fragment or other mutant, which does not support binding to an Fc receptor, e.g., it has a mutagenized or deleted Fc receptor binding region.

[5631] In a preferred embodiment, an anti-46508 antibody alters (e.g., increases or decreases) the peptidyl-tRNA hydrolase activity of a 46508 polypeptide. For example, the antibody can bind at or in proximity to the active site, e.g., to an epitope that includes a residue located from about 44 to 221 of SEQ ID NO: 102.

[5632] The antibody can be coupled to a toxin, e.g., a polypeptide toxin, e.g., ricin or diphtheria toxin or an active fragment thereof, or to a radioactive nucleus or an imaging agent, e.g., a radioactive, enzymatic, or other imaging agent, e.g., an NMR contrast agent. Labels which produce detectable radioactive emissions or fluorescence are preferred.

[5633] An anti-46508 antibody (e.g., monoclonal antibody) can be used to isolate 46508 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-46508 antibody can be used to detect 46508 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-46508 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labelling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

[5634] The invention also includes a nucleic acid which encodes an anti-46508 antibody, e.g., an anti-46508 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g., CHO or lymphatic cells.

[5635] The invention also includes cell lines, e.g., hybridomas, which make an anti-46508 antibody, e.g., an antibody described herein, and methods of using said cells to make a 46508 antibody.

[5636] 46508 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[5637] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[5638] A vector can include a 46508 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 46508 proteins, mutant forms of 46508 proteins, fusion proteins, and the like).

[5639] The recombinant expression vectors of the invention can be designed for expression of 46508 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[5640] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[5641] Purified fusion proteins can be used in 46508 activity assays, (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 46508 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells which are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six weeks).

[5642] One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
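
The codon-replacement strategy described above can be illustrated with a minimal Python sketch that back-translates a peptide into DNA using a single codon per amino acid. The table PREFERRED_CODONS and the function back_translate are hypothetical names introduced only for illustration. The codon choices are assumed to be among those frequently used in E. coli and should be checked against a current codon-usage table before any practical use.

```python
# Hypothetical table: one codon per amino acid, ASSUMED to be frequently
# used in E. coli. Verify against a current codon-usage table before use.
PREFERRED_CODONS = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TAA",  # stop codon
}


def back_translate(peptide: str) -> str:
    """Return a DNA sequence encoding `peptide` using the table above."""
    try:
        return "".join(PREFERRED_CODONS[aa] for aa in peptide.upper())
    except KeyError as err:
        # Reject symbols that are not one of the 20 amino acids or '*'.
        raise ValueError(f"unknown amino acid symbol: {err.args[0]}") from None


# Example: a three-residue peptide followed by a stop codon.
print(back_translate("MKL*"))
```

A single-codon table is the simplest possible choice. A practical design would typically also weigh factors such as mRNA secondary structure and restriction sites, which this sketch ignores.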

[5643] The 46508 expression vector can be a yeast expression vector, a vector for expression in insect cells (e.g., a baculovirus expression vector), or a vector suitable for expression in mammalian cells.

[5644] When used in mammalian cells, the expression vector's control functions can be provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[5645] In another embodiment, the promoter is an inducible promoter, e.g., a promoter regulated by a steroid hormone, by a polypeptide hormone (e.g., by means of a signal transduction pathway), or by a heterologous polypeptide (e.g., the tetracycline-inducible systems, “Tet-On” and “Tet-Off”; see, e.g., Clontech Inc., CA, Gossen and Bujard (1992) Proc. Natl. Acad. Sci. USA 89:5547, and Paillard (1998) Human Gene Therapy 9:983).

[5646] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3:537-546).

[5647] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus.

[5648] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., a 46508 nucleic acid molecule within a recombinant expression vector or a 46508 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[5649] A host cell can be any prokaryotic or eukaryotic cell. For example, a 46508 protein can be expressed in bacterial cells (such as E. coli), insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells (African green monkey kidney cells CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182)). Other suitable host cells are known to those skilled in the art.

[5650] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[5651] A host cell of the invention can be used to produce (i.e., express) a 46508 protein. Accordingly, the invention further provides methods for producing a 46508 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a 46508 protein has been introduced) in a suitable medium such that a 46508 protein is produced. In another embodiment, the method further includes isolating a 46508 protein from the medium or the host cell.

[5652] In another aspect, the invention features a cell or purified preparation of cells which include a 46508 transgene, or which otherwise misexpress 46508. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include a 46508 transgene, e.g., a heterologous form of a 46508, e.g., a gene derived from humans (in the case of a non-human cell). The 46508 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene that misexpresses an endogenous 46508, e.g., a gene the expression of which is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders that are related to mutated or misexpressed 46508 alleles or for use in drug screening.

[5653] In another aspect, the invention features a human cell, e.g., a hematopoietic stem cell, transformed with nucleic acid which encodes a subject 46508 polypeptide.

[5654] Also provided are cells, preferably human cells, e.g., human hematopoietic or fibroblast cells, in which an endogenous 46508 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 46508 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 46508 gene. For example, an endogenous 46508 gene which is “transcriptionally silent,” e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques such as targeted homologous recombination can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published on May 16, 1991.

[5655] In a preferred embodiment, recombinant cells described herein can be used for replacement therapy in a subject. For example, a nucleic acid encoding a 46508 polypeptide operably linked to an inducible promoter (e.g., a steroid hormone receptor-regulated promoter) is introduced into a human or nonhuman, e.g., mammalian, e.g., porcine recombinant cell. The cell is cultivated and encapsulated in a biocompatible material, such as poly-lysine alginate, and subsequently implanted into the subject. See, e.g., Lanza (1996) Nat. Biotechnol. 14:1107; Joki et al. (2001) Nat. Biotechnol. 19:35; and U.S. Pat. No. 5,876,742. Production of 46508 polypeptide can be regulated in the subject by administering an agent (e.g., a steroid hormone) to the subject. In another preferred embodiment, the implanted recombinant cells express and secrete an antibody specific for a 46508 polypeptide. The antibody can be any antibody or any antibody derivative described herein.

[5656] 46508 Transgenic Animals

[5657] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of a 46508 protein and for identifying and/or evaluating modulators of 46508 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 46508 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[5658] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of a 46508 protein to particular cells. A transgenic founder animal can be identified based upon the presence of a 46508 transgene in its genome and/or expression of 46508 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a 46508 protein can further be bred to other transgenic animals carrying other transgenes.

[5659] 46508 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments the nucleic acid is placed under the control of a tissue specific promoter, e.g., a milk or egg specific promoter, and the protein or polypeptide is recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[5660] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[5661] Uses of 46508

[5662] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: a) screening assays; b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); and c) methods of treatment (e.g., therapeutic and prophylactic).

[5663] The isolated nucleic acid molecules of the invention can be used, for example, to express a 46508 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect a 46508 mRNA (e.g., in a biological sample) or a genetic alteration in a 46508 gene, and to modulate 46508 activity, as described further below. The 46508 proteins can be used to treat disorders characterized by insufficient or excessive production of a 46508 substrate or production of 46508 inhibitors. In addition, the 46508 proteins can be used to screen for naturally occurring 46508 substrates, to screen for drugs or compounds which modulate 46508 activity, as well as to treat disorders characterized by insufficient or excessive production of 46508 protein or production of 46508 protein forms which have decreased, aberrant or unwanted activity compared to 46508 wild type protein (e.g., a cell proliferative or differentiative disorder, viral infection, metabolic disorder). Moreover, the anti-46508 antibodies of the invention can be used to detect and isolate 46508 proteins, regulate the bioavailability of 46508 proteins, and modulate 46508 activity.

[5664] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 46508 polypeptide is provided. The method includes: contacting the compound with the subject 46508 polypeptide; and evaluating ability of the compound to interact with, e.g., to bind or form a complex with the subject 46508 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with subject 46508 polypeptide. It can also be used to find natural or synthetic inhibitors of subject 46508 polypeptide. Screening methods are discussed in more detail below.

[5665] 46508 Screening Assays

[5666] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to 46508 proteins, have a stimulatory or inhibitory effect on, for example, 46508 expression or 46508 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 46508 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 46508 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[5667] In one embodiment, the invention provides assays for screening candidate or test compounds which are substrates of a 46508 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate an activity of a 46508 protein or polypeptide or a biologically active portion thereof.

[5668] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann, R. N. et al. (1994) J. Med. Chem. 37:2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).

[5669] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. USA 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and Gallop et al. (1994) J. Med. Chem. 37:1233.

[5670] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner, U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. 5,223,409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382; Felici (1991) J. Mol. Biol. 222:301-310; Ladner supra.).

[5671] In one embodiment, an assay is a cell-based assay in which a cell which expresses a 46508 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 46508 activity is determined. Determining the ability of the test compound to modulate 46508 activity can be accomplished by monitoring, for example, hydrolytic activity. The cell, for example, can be of mammalian origin, e.g., human.

[5672] The ability of the test compound to modulate 46508 binding to a compound, e.g., a 46508 substrate, or to bind to 46508 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 46508 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 46508 could be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 46508 binding to a 46508 substrate in a complex. For example, compounds (e.g., 46508 substrates) can be labeled with ¹²⁵I, ³⁵S, ¹⁴C, or ³H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[5673] The ability of a compound (e.g., a 46508 substrate) to interact with 46508 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 46508 without the labeling of either the compound or the 46508. McConnell, H. M. et al. (1992) Science 257:1906-1912. As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 46508.

[5674] In yet another embodiment, a cell-free assay is provided in which a 46508 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 46508 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 46508 proteins to be used in assays of the present invention include fragments which participate in interactions with non-46508 molecules, e.g., fragments with high surface probability scores.

[5675] Soluble and/or membrane-bound forms of isolated proteins (e.g., 46508 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, Isotridecylpoly(ethylene glycol ether)ₙ, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[5676] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[5677] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternatively, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
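
The distance dependence of energy transfer noted above is commonly modeled by the Förster relation, E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor separation and R0 is the separation at which transfer is 50% efficient for a given label pair. The Python sketch below evaluates that relation. The function name and the 5 nm value of R0 are illustrative assumptions and are not taken from this disclosure.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Energy-transfer efficiency from the Forster relation.

    E = 1 / (1 + (r / R0)**6), with r and R0 in the same units.
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Forster radius > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


# Illustrative values only: R0 = 5 nm is an assumed label-pair constant.
for r in (2.5, 5.0, 10.0):
    print(f"r = {r:4.1f} nm  ->  E = {fret_efficiency(r, 5.0):.3f}")
```

Because efficiency falls off with the sixth power of distance, acceptor emission is appreciable only when the labeled molecules are within a few nanometers of each other, which is why a strong acceptor signal is taken as evidence of binding.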

[5678] In another embodiment, determining the ability of the 46508 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal which can be used as an indication of real-time reactions between biological molecules.
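
As a hedged illustration of how a real-time binding signal of this kind might be interpreted, the Python sketch below evaluates the association phase of a simple 1:1 binding model, R(t) = Req (1 - exp(-(ka C + kd) t)) with Req = Rmax C / (C + KD) and KD = kd/ka. The choice of a 1:1 model, the function name, and all rate constants are assumptions made for illustration only. None is taken from this disclosure or from any instrument's software.

```python
import math


def association_response(t_s: float, conc_molar: float, ka: float,
                         kd: float, r_max: float) -> float:
    """Response at time t for a 1:1 binding model, starting from R(0) = 0.

    ka: association rate constant (1/(M*s)); kd: dissociation rate (1/s);
    r_max: response at full saturation of the immobilized partner.
    """
    if min(t_s, conc_molar) < 0 or ka <= 0 or kd <= 0:
        raise ValueError("times/concentrations must be >= 0, rates > 0")
    k_obs = ka * conc_molar + kd             # observed rate constant
    r_eq = r_max * ka * conc_molar / k_obs   # plateau (equilibrium) response
    return r_eq * (1.0 - math.exp(-k_obs * t_s))


# Assumed constants: ka = 1e5 1/(M*s), kd = 1e-3 1/s, so KD = 10 nM.
for t in (0, 100, 500, 5000):
    print(t, round(association_response(t, 1e-8, 1e5, 1e-3, 100.0), 2))
```

With the analyte concentration equal to KD, the plateau response approaches half of r_max, which is one simple check on a fitted KD.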

[5679] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound (which is not anchored) can be labeled, either directly or indirectly, with detectable labels discussed herein.

[5680] It may be desirable to immobilize either 46508, an anti-46508 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a 46508 protein, or interaction of a 46508 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/46508 fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 46508 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and complex formation is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 46508 binding or activity determined using standard techniques.

[5681] Other techniques for immobilizing either a 46508 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 46508 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical).

[5682] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[5683] In one embodiment, this assay is performed utilizing antibodies reactive with 46508 protein or target molecules but which do not interfere with binding of the 46508 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 46508 protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 46508 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 46508 protein or target molecule.

[5684] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components, by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., (1993) Trends Biochem Sci 18:284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York.); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. (1999) Current Protocols in Molecular Biology, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., (1998) J Mol Recognit 11: 141-8; Hage, D. S., and Tweed, S. A. (1997) J Chromatogr B Biomed Sci Appl. 699:499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[5685] In a preferred embodiment, the assay includes contacting the 46508 protein or biologically active portion thereof with a known compound which binds 46508 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a 46508 protein, wherein determining the ability of the test compound to interact with a 46508 protein includes determining the ability of the test compound to preferentially bind to 46508 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[5686] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 46508 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of a 46508 protein through modulation of the activity of a downstream effector of a 46508 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[5687] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), a reaction mixture containing the target gene product and the binding partner is prepared under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.
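
The comparison between control and test reaction mixtures described above can be expressed numerically. The Python sketch below computes percent inhibition of complex formation from a control signal, a test signal, and an optional background reading. The function name, the use of a background subtraction, and the example readings are illustrative assumptions and do not describe any particular detection method.

```python
def percent_inhibition(control_signal: float, test_signal: float,
                       background: float = 0.0) -> float:
    """Percent reduction in complex signal caused by the test compound.

    control_signal: complex signal without test compound (or with placebo).
    test_signal: complex signal with the test compound present.
    background: signal from a blank with no complex (assumed measurable).
    """
    control = control_signal - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (1.0 - (test_signal - background) / control)


# Hypothetical readings in arbitrary units.
print(percent_inhibition(1000.0, 250.0))         # no background correction
print(percent_inhibition(1000.0, 550.0, 100.0))  # with background correction
```

A value near 0 indicates that the compound does not interfere with complex formation, and a value near 100 indicates essentially complete interference. Running the same calculation for normal and mutant target gene products allows the two to be compared directly.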

[5688] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[5689] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner, is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled, either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[5690] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[5691] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[5692] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in which either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[5693] In yet another aspect, the 46508 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 46508 (“46508-binding proteins” or “46508-bp”) and are involved in 46508 activity. Such 46508-bps can be activators or inhibitors of signals by the 46508 proteins or 46508 targets as, for example, downstream elements of a 46508-mediated signaling pathway.

[5694] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a 46508 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively, the 46508 protein can be fused to the activator domain.) If the “bait” and the “prey” proteins are able to interact, in vivo, forming a 46508-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., lacZ) which is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene which encodes the protein which interacts with the 46508 protein.

[5695] In another embodiment, modulators of 46508 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 46508 mRNA or protein evaluated relative to the level of expression of 46508 mRNA or protein in the absence of the candidate compound. When expression of 46508 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 46508 mRNA or protein expression. Alternatively, when expression of 46508 mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 46508 mRNA or protein expression. The level of 46508 mRNA or protein expression can be determined by methods described herein for detecting 46508 mRNA or protein.

[5696] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a 46508 protein can be confirmed in vivo, e.g., in an animal such as an animal model for cancer or viral disease.

[5697] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a 46508 modulating agent, an antisense 46508 nucleic acid molecule, a 46508-specific antibody, or a 46508-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[5698] 46508 Detection Assays

[5699] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome, e.g., to locate gene regions associated with genetic disease or to associate 46508 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[5700] 46508 Chromosome Mapping

[5701] The 46508 nucleotide sequences or portions thereof can be used to map the location of the 46508 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 46508 sequences with genes associated with disease.

[5702] Briefly, 46508 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 46508 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 46508 sequences will yield an amplified fragment.

[5703] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes, and a full set of mouse chromosomes, can allow easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220:919-924).

[5704] Other mapping strategies, e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries, can be used to map 46508 to a chromosomal location.

[5705] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques ((1988) Pergamon Press, New York).

[5706] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[5707] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325:783-787.

[5708] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 46508 gene, can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.
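
The comparison of affected and unaffected individuals described above can be sketched in code. The following is a minimal illustration only, assuming each individual's sequence differences have already been reduced to a set of variant labels; the function name and the labels "v1" to "v3" are hypothetical and do not come from the sequences described herein.

```python
def candidate_variants(affected, unaffected):
    """Return variants seen in at least one affected individual
    and in no unaffected individual.

    Both arguments map an individual's identifier to the set of
    variant labels observed in that individual (hypothetical format).
    """
    seen_in_affected = set().union(*affected.values())
    seen_in_unaffected = set().union(*unaffected.values())
    return seen_in_affected - seen_in_unaffected


# Made-up placeholder data, for illustration only.
affected = {"A1": {"v1", "v2"}, "A2": {"v1", "v3"}}
unaffected = {"U1": {"v2"}}

print(sorted(candidate_variants(affected, unaffected)))  # ['v1', 'v3']
```

A variant that passes this filter is only a candidate. As noted above, sequencing of genes from several individuals would still be needed to distinguish mutations from polymorphisms.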

[5709] 46508 Tissue Typing

[5710] 46508 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).

[5711] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 46508 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.

[5712] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences of SEQ ID NO:101 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO: 103 are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[5713] If a panel of reagents from 46508 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[5714] Use of Partial 46508 Sequences in Forensic Biology

[5715] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[5716] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e. another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO:101 (e.g., fragments derived from the noncoding regions of SEQ ID NO: 101 having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[5717] The 46508 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 46508 probes can be used to identify tissue by species and/or by organ type.

[5718] In a similar fashion, these reagents, e.g., 46508 primers or probes can be used to screen tissue culture for contamination (i.e. screen for the presence of a mixture of different types of cells in a culture).

[5719] Predictive Medicine of 46508

[5720] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[5721] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene which encodes 46508.

[5722] Such disorders include, e.g., a disorder associated with the misexpression of a 46508 molecule; a disorder of cell growth and proliferation (e.g., cancer); a disorder resulting from viral or bacterial infection; a disorder resulting from environmental toxins; a disorder of metabolism.

[5723] The method includes one or more of the following:

[5724] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 46508 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[5725] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 46508 gene;

[5726] detecting, in a tissue of the subject, the misexpression of the 46508 gene, at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[5727] detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of a 46508 polypeptide.

[5728] In preferred embodiments the method includes: ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 46508 gene; an insertion of one or more nucleotides into the gene, a point mutation, e.g., a substitution of one or more nucleotides of the gene, a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[5729] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from SEQ ID NO:101, or naturally occurring mutants thereof or 5′ or 3′ flanking sequences naturally associated with the 46508 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[5730] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 46508 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 46508.

[5731] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[5732] In preferred embodiments the method includes determining the structure of a 46508 gene, an abnormal structure being indicative of risk for the disorder.

[5733] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 46508 protein or a nucleic acid, which hybridizes specifically with the gene. These and other embodiments are discussed below.

[5734] Diagnostic and Prognostic Assays of 46508

[5735] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 46508 molecules and for identifying variations and mutations in the sequence of 46508 molecules.

[5736] Expression Monitoring and Profiling:

[5737] The presence, level, or absence of 46508 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 46508 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 46508 protein such that the presence of 46508 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the 46508 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 46508 genes; measuring the amount of protein encoded by the 46508 genes; or measuring the activity of the protein encoded by the 46508 genes.

[5738] The level of mRNA corresponding to the 46508 gene in a cell can be determined both by in situ and by in vitro formats.

[5739] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 46508 nucleic acid, such as the nucleic acid of SEQ ID NO:101, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 46508 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[5740] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 46508 genes.

[5741] The level of mRNA in a sample that is encoded by 46508 can be evaluated with nucleic acid amplification, e.g., by rtPCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
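
The general size ranges given above for amplification primers can be illustrated with a short sketch. It is a minimal example under stated assumptions: primer positions are given as 0-based coordinates on the plus strand, the class and function names are hypothetical, and the "about 10 to 30" and "about 50 to 200" figures are treated as hard limits for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimerPair:
    # All coordinates are 0-based positions on the plus strand (assumption).
    forward_start: int  # first base covered by the forward primer
    forward_len: int    # length of the forward primer in nucleotides
    reverse_start: int  # first plus-strand base covered by the reverse primer
    reverse_len: int    # length of the reverse primer in nucleotides


def flanked_length(pair):
    """Number of nucleotides lying between the two primers."""
    return pair.reverse_start - (pair.forward_start + pair.forward_len)


def within_general_ranges(pair, primer_range=(10, 30), flank_range=(50, 200)):
    """Check primer lengths and the flanked region against the stated ranges."""
    primers_ok = all(
        primer_range[0] <= n <= primer_range[1]
        for n in (pair.forward_len, pair.reverse_len)
    )
    flank_ok = flank_range[0] <= flanked_length(pair) <= flank_range[1]
    return primers_ok and flank_ok


pair = PrimerPair(forward_start=100, forward_len=20, reverse_start=240, reverse_len=22)
print(flanked_length(pair))         # 120
print(within_general_ranges(pair))  # True
```

The check covers only length and spacing. It says nothing about specificity, melting temperature, or the other conditions and reagents on which amplification depends.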

[5742] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 46508 gene being analyzed.

[5743] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 46508 mRNA, or genomic DNA, and comparing the presence of 46508 mRNA or genomic DNA in the control sample with the presence of 46508 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 46508 transcript levels.

[5744] A variety of methods can be used to determine the level of protein encoded by 46508. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody, with a sample to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)₂) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[5745] The detection methods can be used to detect 46508 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 46508 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 46508 protein include introducing into a subject a labeled anti-46508 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-46508 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[5746] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 46508 protein, and comparing the presence of 46508 protein in the control sample with the presence of 46508 protein in the test sample.

[5747] The invention also includes kits for detecting the presence of 46508 in a biological sample. For example, the kit can include a compound or agent capable of detecting 46508 protein or mRNA in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 46508 protein or nucleic acid.

[5748] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[5749] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples which can be assayed and compared to the test sample contained. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[5750] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with misexpressed or aberrant or unwanted 46508 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as pain or deregulated cell proliferation.

[5751] In one embodiment, a disease or disorder associated with aberrant or unwanted 46508 expression or activity is identified. A test sample is obtained from a subject and 46508 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 46508 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 46508 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[5752] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 46508 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for a cell proliferative disorder.

[5753] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 46508 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 46508 (e.g., other genes associated with a 46508-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
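
One way such data records might be laid out as a table in a relational database is sketched below. This is an illustration only: it uses SQLite through Python's standard library rather than the Oracle or Sybase environments mentioned above, and the table name, column names, identifiers, and expression values are hypothetical placeholders rather than measured data.

```python
import sqlite3

# Made-up placeholder rows:
# (sample_id, subject_id, diagnosis, treatment, gene, expression_level)
EXAMPLE_ROWS = [
    ("S1", "P1", "diagnosis_X", "treatment_Y", "46508", 2.5),
    ("S1", "P1", "diagnosis_X", "treatment_Y", "other_gene_A", 1.1),
    ("S2", "P2", None, None, "46508", 0.8),
]


def build_db(rows):
    """Create an in-memory table with one expression value per (sample, gene)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE expression_record (
            sample_id        TEXT NOT NULL,  -- identifier of the sample
            subject_id       TEXT,           -- subject the sample was derived from
            diagnosis        TEXT,           -- optional descriptor
            treatment        TEXT,           -- optional descriptor
            gene             TEXT NOT NULL,  -- e.g. '46508' or another gene
            expression_level REAL NOT NULL,  -- value representing expression
            PRIMARY KEY (sample_id, gene)
        )
        """
    )
    conn.executemany(
        "INSERT INTO expression_record VALUES (?, ?, ?, ?, ?, ?)", rows
    )
    return conn


def levels_for(conn, gene):
    """Return (sample_id, expression_level) pairs for one gene, by sample."""
    cur = conn.execute(
        "SELECT sample_id, expression_level FROM expression_record "
        "WHERE gene = ? ORDER BY sample_id",
        (gene,),
    )
    return cur.fetchall()


conn = build_db(EXAMPLE_ROWS)
print(levels_for(conn, "46508"))  # [('S1', 2.5), ('S2', 0.8)]
```

Storing one row per sample and gene lets a record carry values for genes other than 46508 without changing the table layout. The descriptors repeat on each row here for brevity and could instead be kept in a separate sample table.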

[5754] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 46508 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used to diagnose, for example, a cell differentiative or proliferative disorder in a subject wherein an increase in 46508 expression is an indication that the subject has or is disposed to having a disorder. The method can be used to monitor a treatment for, e.g., a cell proliferative or differentiative disorder in a subject. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[5755] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 46508 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from an uncontacted cell.

[5756] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 46508 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profile is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
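
The distance-vector metric mentioned above can be sketched as follows. This is one possible reading, assuming the "length" of the difference vector is its Euclidean norm and that every profile lists the same genes in the same order; the function names, reference labels, and numeric values are hypothetical.

```python
import math


def profile_distance(a, b):
    """Euclidean length of the difference between two expression profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar_reference(subject, references):
    """Return the name of the reference profile closest to the subject profile."""
    return min(references, key=lambda name: profile_distance(subject, references[name]))


# Made-up two-gene profiles, for illustration only.
subject = [3.0, 4.0]
references = {"ref_A": [0.0, 0.0], "ref_B": [3.0, 5.0]}

print(profile_distance(subject, references["ref_A"]))  # 5.0
print(most_similar_reference(subject, references))     # ref_B
```

Other routine statistical measures, such as a correlation coefficient, could be substituted for `profile_distance` without changing the selection step.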

[5757] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[5758] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile, and the reference expression profiles each include a value representing the level of 46508 expression.
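
By way of illustration only, the steps of receiving a subject expression profile, scoring it against reference profiles, and selecting the most similar reference profile can be sketched as follows. The sketch assumes that each profile is a list of numeric values of equal length and uses the length of the difference vector as the comparison score. The function names, profile names, and numeric values are hypothetical and are not taken from the specification.

```python
import math


def profile_distance(a, b):
    """Length of the difference vector between two expression profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar_reference(subject, references):
    """Return (name, score) of the reference profile closest to the subject."""
    scores = {name: profile_distance(subject, ref)
              for name, ref in references.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


# Hypothetical data: each position is one value in the profile; the first
# position stands in for the level of 46508 expression.
references = {
    "normal": [1.0, 2.0, 0.5],
    "disease": [4.0, 1.0, 3.0],
}
subject = [3.5, 1.5, 2.5]

best_name, best_score = most_similar_reference(subject, references)
```

Other routine statistical measures could be substituted for `profile_distance` without changing the selection step.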

[5759] 46508 Arrays and Uses Thereof

[5760] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to a 46508 molecule (e.g., a 46508 nucleic acid or a 46508 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm², and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to the addresses of the plurality can be disposed on the array.

[5761] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to a 46508 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 46508. Each address of the subset can include a capture probe that hybridizes to a different region of a 46508 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for a 46508 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 46508 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 46508 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).

[5762] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[5763] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to a 46508 polypeptide or fragment thereof. The polypeptide can be a naturally-occurring interaction partner of 46508 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-46508 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[5764] In another aspect, the invention features a method of analyzing the expression of 46508. The method includes providing an array as described above; contacting the array with a sample; and detecting binding of a 46508-molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally, the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[5765] In another embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array, particularly the expression of 46508. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 46508. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
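
By way of illustration only, the grouping of genes by expression across diverse samples can be sketched as follows. For brevity, the sketch ranks genes by Pearson correlation with 46508 rather than performing the hierarchical, k-means, or Bayesian clustering named above; it is a simpler stand-in, not one of those methods. The gene names, expression values, and threshold are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def co_regulated(expression, target, threshold=0.9):
    """Genes whose expression across samples correlates with the target gene."""
    reference = expression[target]
    hits = [(gene, pearson(reference, values))
            for gene, values in expression.items() if gene != target]
    kept = [(gene, r) for gene, r in hits if r >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical expression levels, one value per tissue sample.
expression = {
    "46508": [1.0, 2.0, 3.0, 4.0],
    "gene_A": [2.1, 3.9, 6.2, 8.0],   # rises with 46508
    "gene_B": [4.0, 3.0, 2.0, 1.0],   # falls as 46508 rises
    "gene_C": [1.0, 1.0, 1.0, 1.0],   # constant across samples
}

partners = co_regulated(expression, "46508")
```

With a sufficient number of diverse samples, the same expression table could instead be passed to a clustering routine.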

[5766] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 46508 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[5767] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of genes other than the target gene can be ascertained and counteracted.

[5768] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of a 46508-associated disease or disorder; and processes, such as a cellular transformation associated with a 46508-associated disease or disorder. The method can also evaluate the treatment and/or progression of a 46508-associated disease or disorder.

[5769] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 46508) that could serve as a molecular target for diagnosis or therapeutic intervention.

[5770] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon a 46508 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to a 46508 polypeptide or fragment thereof. For example, multiple variants of a 46508 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the addresses of the plurality can be disposed on the array.

[5771] The polypeptide array can be used to detect a 46508 binding compound, e.g., an antibody in a sample from a subject with specificity for a 46508 polypeptide or the presence of a 46508-binding protein or ligand.

[5772] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 46508 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[5773] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 46508 or from a cell or subject in which a 46508 mediated response has been elicited, e.g., by contact of the cell with 46508 nucleic acid or protein, or administration to the cell or subject of 46508 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 46508 (or does not express as highly as in the case of the 46508 positive plurality of capture probes) or from a cell or subject in which a 46508 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which are preferably other than a 46508 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[5774] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe; contacting the array with a first sample from a cell or subject which expresses or mis-expresses 46508 or from a cell or subject in which a 46508-mediated response has been elicited, e.g., by contact of the cell with 46508 nucleic acid or protein, or administration to the cell or subject of 46508 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe; contacting the array with a second sample from a cell or subject which does not express 46508 (or does not express as highly as in the case of the 46508 positive plurality of capture probes) or from a cell or subject in which a 46508 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used, the plurality of addresses with capture probes should be present on both arrays.

[5775] In another aspect, the invention features a method of analyzing 46508, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 46508 nucleic acid or amino acid sequence; and comparing the 46508 sequence with one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 46508.

[5776] Detection of 46508 Variations or Mutations

[5777] The methods of the invention can also be used to detect genetic alterations in a 46508 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by misregulation in 46508 protein activity or nucleic acid expression, such as a cell proliferative or differentiative disorder. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a 46508-protein, or the mis-expression of the 46508 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of 1) a deletion of one or more nucleotides from a 46508 gene; 2) an addition of one or more nucleotides to a 46508 gene; 3) a substitution of one or more nucleotides of a 46508 gene, 4) a chromosomal rearrangement of a 46508 gene; 5) an alteration in the level of a messenger RNA transcript of a 46508 gene, 6) aberrant modification of a 46508 gene, such as of the methylation pattern of the genomic DNA, 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of a 46508 gene, 8) a non-wild type level of a 46508-protein, 9) allelic loss of a 46508 gene, and 10) inappropriate post-translational modification of a 46508-protein.

[5778] An alteration can be detected with a probe/primer in a polymerase chain reaction, such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 46508-gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a 46508 gene under conditions such that hybridization and amplification of the 46508-gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[5779] In another embodiment, mutations in a 46508 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA are isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment lengths are determined, e.g., by gel electrophoresis, and compared. Differences in fragment lengths between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
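
By way of illustration only, the comparison of restriction fragment lengths between sample and control DNA can be sketched as follows. The sketch assumes a linear, single-stranded text representation of the DNA and a single enzyme, and it ignores overhangs and gel resolution. The sequences and function names are hypothetical; EcoRI (recognition site GAATTC, cleaving after the first G) is used only as a familiar example.

```python
def digest(sequence, site, cut_offset):
    """Sorted fragment lengths after cutting a linear sequence at every site.

    cut_offset is the number of bases from the start of the recognition
    site to the cut position on the strand shown.
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return sorted(b - a for a, b in zip(bounds, bounds[1:]) if b > a)


def fragments_differ(sample, control, site, cut_offset):
    """True if sample and control give different fragment length patterns."""
    return digest(sample, site, cut_offset) != digest(control, site, cut_offset)


# Hypothetical sequences: the sample carries a single base change (A to G)
# that destroys the second GAATTC site.
control = "AAAAGAATTCTTTTTTGAATTCCC"
sample = "AAAAGAATTCTTTTTTGAGTTCCC"

control_fragments = digest(control, "GAATTC", 1)
sample_fragments = digest(sample, "GAATTC", 1)
mutation_indicated = fragments_differ(sample, control, "GAATTC", 1)
```

Here the loss of one recognition site merges two control fragments into a single longer fragment in the sample, which is the kind of difference that gel electrophoresis would reveal.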

[5780] In other embodiments, genetic mutations in 46508 can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the other. A different probe is located at each address of the plurality. A probe can be complementary to a region of a 46508 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of a 46508 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 46508 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.

[5781] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 46508 gene and detect mutations by comparing the sequence of the sample 46508 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.

[5782] Other methods for detecting mutations in the 46508 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl. Acad Sci USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295).

[5783] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 46508 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[5784] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 46508 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc Natl. Acad. Sci USA 86:2766; see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control 46508 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence; the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).

[5785] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[5786] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl Acad. Sci USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[5787] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.

[5788] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 50%, 60%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to a 46508 nucleic acid.

[5789] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO:101 or the complement of SEQ ID NO:101. Different locations can be different but overlapping, or non-overlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[5790] The set can be useful, e.g., for identifying SNPs, or identifying specific alleles of 46508. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a biallelic or polymorphic locus.

[5791] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the Tm of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
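
By way of illustration only, a set of four oligonucleotides that differ at a single interrogation position can be sketched as follows. The sketch treats exact complementarity as a stand-in for a positive hybridization signal, which is a simplifying assumption. The reference sequence, position, and function names are hypothetical and do not correspond to any sequence of the specification.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def interrogation_set(target, position):
    """Four probes, one per possible base at `position` of the target strand.

    Each probe is fully complementary to the target variant carrying that
    base at the interrogation position.
    """
    probes = {}
    for base in "ACGT":
        variant = target[:position] + base + target[position + 1:]
        probes[base] = reverse_complement(variant)
    return probes


def call_base(sample, probes):
    """Base whose probe is perfectly complementary to the sample, else None."""
    for base, probe in probes.items():
        if reverse_complement(probe) == sample:
            return base
    return None


# Hypothetical 11-base region with the interrogation position at index 5.
reference = "ATGCCGTAACG"
probes = interrogation_set(reference, 5)

sample = "ATGCCATAACG"   # carries A instead of G at the interrogation position
called = call_base(sample, probes)
```

In practice each of the four oligonucleotides could carry a distinguishable label, so that the identity of the hybridizing probe reports the nucleotide present at the interrogation position.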

[5792] In a preferred embodiment, the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, a 46508 nucleic acid.

[5793] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a 46508 gene.

[5794] Use of 46508 Molecules as Surrogate Markers

[5795] The 46508 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 46508 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 46508 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker which correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[5796] The 46508 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker which correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., a 46508 marker) transcription or expression, the amplified marker may be in a quantity which is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-46508 antibodies may be employed in an immune-based detection system for a 46508 protein marker, or 46508-specific radiolabeled probes may be used to detect a 46508 mRNA marker. Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[5797] The 46508 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker which correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35:1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA or protein (e.g., 46508 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 46508 DNA may correlate with 46508 drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[5798] Pharmaceutical Compositions of 46508

[5799] The nucleic acid and polypeptides, fragments thereof, as well as anti-46508 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[5800] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[5801] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol and sorbitol, or sodium chloride, in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

[5802] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying which yields a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[5803] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[5804] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[5805] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[5806] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[5807] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[5808] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[5809] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds which exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
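The ratio described above can be illustrated with a minimal sketch. The function name and the numeric values below are hypothetical and chosen only for illustration; they are not data from the invention. Both doses are assumed to be expressed in the same units.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index, expressed as the ratio LD50/ED50.

    Both arguments must use the same dose units (e.g., mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical values in mg/kg, for illustration only.
example_index = therapeutic_index(ld50=200.0, ed50=10.0)
print(f"therapeutic index = {example_index}")
```

A larger value indicates a wider margin between the therapeutically effective dose and the lethal dose, which is why compounds with high therapeutic indices are preferred.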

[5810] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.
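One simple way to estimate an IC50 from cell culture data is sketched below. It assumes percent-inhibition measurements at ascending test concentrations and interpolates linearly on a log-concentration scale between the two points that bracket 50% inhibition. The function name and the interpolation scheme are assumptions made for illustration; in practice a fitted dose-response model may be used instead.

```python
import math


def estimate_ic50(concentrations, inhibitions):
    """Estimate the concentration giving half-maximal (50%) inhibition.

    concentrations: positive values in ascending order (any consistent unit).
    inhibitions: percent inhibition (0-100) measured at each concentration.
    Interpolates on a log10 concentration scale between bracketing points.
    """
    points = list(zip(concentrations, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi != i_lo:
            fraction = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + fraction * (
                math.log10(c_hi) - math.log10(c_lo)
            )
            return 10 ** log_ic50
    raise ValueError("50% inhibition is not bracketed by the data")


# Hypothetical measurements, for illustration only.
print(estimate_ic50([1.0, 100.0], [0.0, 100.0]))
```

If no pair of adjacent measurements brackets 50% inhibition, the sketch raises an error rather than extrapolating.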

[5811] As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
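The arithmetic implied by a weight-based dosage can be sketched as follows. The broadest range recited above (about 0.001 to 30 mg/kg) and the once-weekly schedule of about 1 to 10 weeks are taken from the text; treating those approximate ranges as hard limits, and the function names themselves, are assumptions made only for this illustration.

```python
# Broadest range recited in the text, in mg per kg body weight.
BROAD_RANGE_MG_PER_KG = (0.001, 30.0)


def total_dose_mg(dose_mg_per_kg: float, body_weight_kg: float) -> float:
    """Convert a per-kilogram dose into a total dose for one administration."""
    low, high = BROAD_RANGE_MG_PER_KG
    if not low <= dose_mg_per_kg <= high:
        raise ValueError("dose is outside the recited 0.001 to 30 mg/kg range")
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return dose_mg_per_kg * body_weight_kg


def weekly_schedule(dose_mg_per_kg: float, body_weight_kg: float, weeks: int):
    """Return one total dose (mg) per week for a once-weekly regimen."""
    if not 1 <= weeks <= 10:
        raise ValueError("the text recites about 1 to 10 weeks")
    return [total_dose_mg(dose_mg_per_kg, body_weight_kg)] * weeks


# Hypothetical subject: 70 kg, 5 mg/kg once weekly for 4 weeks.
print(weekly_schedule(5.0, 70.0, 4))
```

As the paragraph above notes, the actual dosage and timing depend on factors such as disease severity, previous treatments, and the general health of the subject, none of which this sketch models.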

[5812] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J. Acquired Immune Deficiency Syndromes and Human Retrovirology 14:193).

[5813] The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[5814] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

[5815] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, puromycin, maytansinoids, e.g., maytansinol (see U.S. Pat. No. 5,208,020), CC-1065 (see U.S. Pat. Nos. 5,475,092, 5,585,499, 5,846,545) and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, CC-1065, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine, vinblastine, taxol and maytansinoids). Radioactive ions include, but are not limited to, iodine, yttrium and praseodymium.

[5816] The conjugates of the invention can be used for modifying a given biological response; the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[5817] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[5818] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[5819] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[5820] Methods of Treatment for 46508

[5821] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 46508 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[5822] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. “Pharmacogenomics”, as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype”, or “drug response genotype”). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 46508 molecules of the present invention or 46508 modulators according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

[5823] In one aspect, the invention provides a method for preventing in a subject, a disease or condition associated with an aberrant or unwanted 46508 expression or activity, by administering to the subject a 46508 or an agent which modulates 46508 expression or at least one 46508 activity. Subjects at risk for a disease which is caused or contributed to by aberrant or unwanted 46508 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 46508 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 46508 aberrance, for example, a 46508, 46508 agonist or 46508 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[5824] It is possible that some 46508 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[5825] The 46508 molecules can also act as novel diagnostic targets and therapeutic agents for controlling one or more of immune disorders, or metabolic disorders.

[5826] The 46508 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of immune (or inflammatory) disorders. Examples of immune disorders or diseases include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy such as atopic allergy.

[5827] As used herein, the term “erythroid associated disorders” includes disorders involving aberrant (increased or deficient) erythroblast proliferation, e.g., an erythroleukemia, and aberrant (increased or deficient) erythroblast differentiation, e.g., an anemia. Erythrocyte-associated disorders include anemias such as, for example, drug- (chemotherapy-) induced anemias, hemolytic anemias due to hereditary cell membrane abnormalities, such as hereditary spherocytosis, hereditary elliptocytosis, and hereditary pyropoikilocytosis; hemolytic anemias due to acquired cell membrane defects, such as paroxysmal nocturnal hemoglobinuria and spur cell anemia; hemolytic anemias caused by antibody reactions, for example to the RBC antigens, or antigens of the ABO system, Lewis system, Ii system, Rh system, Kidd system, Duffy system, and Kell system; methemoglobinemia; a failure of erythropoiesis, for example, as a result of aplastic anemia, pure red cell aplasia, myelodysplastic syndromes, sideroblastic anemias, and congenital dyserythropoietic anemia; secondary anemia in non-hematologic disorders, for example, as a result of chemotherapy, alcoholism, or liver disease; anemia of chronic disease, such as chronic renal failure; and endocrine deficiency diseases.

[5828] Additionally, 46508 may play an important role in the regulation of metabolism. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes.

[5829] As discussed, successful treatment of 46508 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using an assay described above, that prove to exhibit negative modulatory activity can be used in accordance with the invention to prevent and/or ameliorate symptoms of 46508 disorders. Such molecules can include, but are not limited to, peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)₂ and Fab expression library fragments, scFV molecules, and epitope-binding fragments thereof).

[5830] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[5831] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[5832] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 46508 expression is through the use of aptamer molecules specific for 46508 protein. Aptamers are nucleic acid molecules having a tertiary structure which permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. (1997) Curr. Opin. Chem Biol. 1: 5-9; and Patel, D. J. (1997) Curr Opin Chem Biol 1:32-46). Since nucleic acid molecules may in many cases be more conveniently introduced into target cells than therapeutic protein molecules may be, aptamers offer a method by which 46508 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[5833] Antibodies can be generated that are both specific for target gene product and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 46508 disorders. For a description of antibodies, see the Antibody section above.

[5834] In circumstances wherein injection of an animal or a human subject with a 46508 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 46508 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. (1999) Ann Med 31:66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. (1998) Cancer Treat Res. 94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 46508 protein. Vaccines directed to a disease characterized by 46508 expression may also be generated in this fashion.

[5835] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see e.g., Marasco et al. (1993) Proc. Natl. Acad. Sci. USA 90:7889-7893).

[5836] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 46508 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures as described above.

[5837] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED₅₀ with little or no toxicity. The dosage can vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose can be formulated in animal models to achieve a circulating plasma concentration range that includes the IC₅₀ (i.e., the concentration of the test compound that achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma can be measured, for example, by high performance liquid chromatography.

[5838] Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. The compound which is able to modulate 46508 activity is used as a template, or “imprinting molecule”, to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix which contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al (1996) Current Opinion in Biotechnology 7:89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2:166-173. Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al (1993) Nature 361:645-647. Through the use of isotope-labeling, the “free” concentration of compound which modulates the expression or activity of 46508 can be readily monitored and used in calculations of IC₅₀.

[5839] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC₅₀. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al (1995) Analytical Chemistry 67:2142-2144.

[5840] Another aspect of the invention pertains to methods of modulating 46508 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with a 46508 or an agent that modulates one or more of the activities of 46508 protein associated with the cell. An agent that modulates 46508 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of a 46508 protein (e.g., a 46508 substrate or receptor), a 46508 antibody, a 46508 agonist or antagonist, a peptidomimetic of a 46508 agonist or antagonist, or other small molecule.

[5841] In one embodiment, the agent stimulates one or more 46508 activities. Examples of such stimulatory agents include active 46508 protein and a nucleic acid molecule encoding 46508. In another embodiment, the agent inhibits one or more 46508 activities. Examples of such inhibitory agents include antisense 46508 nucleic acid molecules, anti-46508 antibodies, and 46508 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of a 46508 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., up regulates or down regulates) 46508 expression or activity. In another embodiment, the method involves administering a 46508 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 46508 expression or activity.

[5842] Stimulation of 46508 activity is desirable in situations in which 46508 is abnormally downregulated and/or in which increased 46508 activity is likely to have a beneficial effect. Likewise, inhibition of 46508 activity is desirable in situations in which 46508 is abnormally upregulated and/or in which decreased 46508 activity is likely to have a beneficial effect.

[5843] 46508 Pharmacogenomics

[5844] The 46508 molecules of the present invention, as well as agents, or modulators which have a stimulatory or inhibitory effect on 46508 activity (e.g., 46508 gene expression) as identified by a screening assay described herein can be administered to individuals to treat (prophylactically or therapeutically) 46508 associated disorders (e.g., cancer) associated with aberrant or unwanted 46508 activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a 46508 molecule or 46508 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a 46508 molecule or 46508 modulator.

[5845] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23:983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43:254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as single factors altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase deficiency (G6PD) is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[5846] One pharmacogenomics approach to identifying genes that predict drug response, known as a “genome-wide association,” relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high-resolution map can be generated from a combination of some ten million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.

[5847] Alternatively, a method termed the “candidate gene approach” can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., a 46508 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.

[5848] Alternatively, a method termed “gene expression profiling” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., a 46508 molecule or 46508 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[5849] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 46508 molecule or 46508 modulator, such as a modulator identified by one of the exemplary screening assays described herein.

[5850] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 46508 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 46508 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells, e.g., human cells, will become sensitive to treatment with an agent that the unmodified target cells were resistant to.

[5851] Monitoring the influence of agents (e.g., drugs) on the expression or activity of a 46508 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 46508 gene expression, protein levels, or upregulate 46508 activity, can be monitored in clinical trials of subjects exhibiting decreased 46508 gene expression, protein levels, or downregulated 46508 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 46508 gene expression, protein levels, or downregulate 46508 activity, can be monitored in clinical trials of subjects exhibiting increased 46508 gene expression, protein levels, or upregulated 46508 activity. In such clinical trials, the expression or activity of a 46508 gene, and preferably, other genes that have been implicated in, for example, a 46508-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[5852] 46508 Informatics

[5853] The sequence of a 46508 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a 46508 sequence. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 46508 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequence, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[5854] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[5855] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[5856] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples of annotations to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites, and splice junctions. Non-limiting examples of annotations to amino acid sequences include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
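
The two-table layout described above can be sketched as follows. This is a minimal illustration only: it uses Python's built-in sqlite3 module in place of Sybase or Oracle, and the table names, column names, the identifier "demo_seq", and the helper annotated_regions are all hypothetical choices made for the example. The stored residues are a short fragment, not a full-length sequence.

```python
import sqlite3

# In-memory database standing in for the relational database (assumption:
# SQLite is used here purely for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sequences (
    seq_id   TEXT PRIMARY KEY,   -- identifier for the sequence
    residues TEXT NOT NULL       -- nucleotide or amino acid sequence
);
CREATE TABLE annotations (
    seq_id     TEXT NOT NULL REFERENCES sequences(seq_id),
    descriptor TEXT NOT NULL,    -- annotation text
    start_pos  INTEGER NOT NULL, -- initial position (1-based, inclusive)
    end_pos    INTEGER NOT NULL  -- ultimate position (1-based, inclusive)
);
""")

# Hypothetical record: a short fragment and one annotation on it.
conn.execute("INSERT INTO sequences VALUES (?, ?)",
             ("demo_seq", "ATGGCTGGCTACCTGAGTGAATCGGAC"))
conn.execute("INSERT INTO annotations VALUES (?, ?, ?, ?)",
             ("demo_seq", "initiation codon", 1, 3))


def annotated_regions(connection, seq_id):
    """Return (descriptor, subsequence) pairs for one sequence identifier."""
    query = (
        "SELECT a.descriptor, "
        "       substr(s.residues, a.start_pos, a.end_pos - a.start_pos + 1) "
        "FROM annotations AS a JOIN sequences AS s ON s.seq_id = a.seq_id "
        "WHERE a.seq_id = ? ORDER BY a.start_pos"
    )
    return connection.execute(query, (seq_id,)).fetchall()


print(annotated_regions(conn, "demo_seq"))
```

Keeping the annotations in their own table lets any number of annotations refer to one stored sequence through the shared identifier field.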

[5857] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.

[5858] Thus, in one aspect, the invention features a method of analyzing 46508, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing a 46508 nucleic acid or amino acid sequence; and comparing the 46508 sequence with a second sequence, e.g., one or, more preferably, a plurality of sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 46508. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[5859] The method can include evaluating the sequence identity between a 46508 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.

[5860] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
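
The relationship between target length and chance occurrence can be illustrated with a rough estimate. The sketch below assumes, as a simplification that the text does not state, that residues are independent and equally frequent, so that a given target of length n matches at any one database position with probability 1/4^n for nucleotides or 1/20^n for amino acids. The function name and the database sizes are hypothetical.

```python
def expected_random_hits(length, positions, alphabet_size=4):
    """Expected number of chance matches of one target in a database.

    Assumes independent, equally frequent residues (an idealization):
    each of `positions` start sites matches with probability
    alphabet_size ** -length.
    """
    return positions * float(alphabet_size) ** (-length)


# A 6-nucleotide target is expected about once per 4096 positions,
# while a 30-nucleotide target is vanishingly unlikely at the same scale.
print(expected_random_hits(6, 4096))
print(expected_random_hits(30, 4096))
```

Real sequence databases are not uniformly random, so such figures indicate only the order of magnitude.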

[5861] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[5862] Thus, the invention features a method of making a computer readable record of a 46508 sequence which includes recording the sequence on a computer readable matrix. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[5863] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing a 46508 sequence, or record, in machine-readable form; comparing a second sequence to the 46508 sequence; thereby analyzing a sequence. Comparison can include comparing two sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 46508 sequence includes a sequence being compared. In a preferred embodiment the 46508 or second sequence is stored on a first computer, e.g., at a first site and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 46508 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.
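
The two kinds of comparison named above, sequence identity and inclusion of one sequence within another, can be sketched in a few lines. This is a simplified illustration, not a substitute for BLAST or another alignment program: identity is computed position by position over two equal-length, ungapped sequences, and inclusion is an exact substring search. The function names and the example sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions that match in two equal-length, ungapped sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(seq_a, seq_b) if x == y)
    return 100.0 * matches / len(seq_a)


def find_occurrences(query, subject):
    """Return 0-based start positions of every exact (possibly overlapping) match."""
    positions = []
    start = subject.find(query)
    while start != -1:
        positions.append(start)
        start = subject.find(query, start + 1)
    return positions


print(percent_identity("ATGGCT", "ATGGAT"))   # 5 of 6 positions agree
print(find_occurrences("ATGA", "ATGATGA"))    # overlapping matches are reported
```

An empty list from find_occurrences indicates that the query is not included in the subject sequence.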

[5864] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has a 46508-associated disease or disorder or a pre-disposition to a 46508-associated disease or disorder, wherein the method comprises the steps of determining 46508 sequence information associated with the subject and based on the 46508 sequence information, determining whether the subject has a 46508-associated disease or disorder or a pre-disposition to a 46508-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[5865] The invention further provides in an electronic system and/or in a network, a method for determining whether a subject has a 46508-associated disease or disorder or a pre-disposition to a disease associated with a 46508 wherein the method comprises the steps of determining 46508 sequence information associated with the subject, and based on the 46508 sequence information, determining whether the subject has a 46508-associated disease or disorder or a pre-disposition to a 46508-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, comparing the 46508 sequence of the subject to the 46508 sequences in the database to thereby determine whether the subject has a 46508-associated disease or disorder, or a pre-disposition for such.

[5866] The present invention also provides in a network, a method for determining whether a subject has a 46508-associated disease or disorder or a pre-disposition to a 46508-associated disease or disorder, said method comprising the steps of receiving 46508 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 46508 and/or corresponding to a 46508-associated disease or disorder, and based on one or more of the phenotypic information, the 46508 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has a 46508-associated disease or disorder or a pre-disposition to a 46508-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[5867] The present invention also provides a method for determining whether a subject has a 46508-associated disease or disorder or a pre-disposition to a 46508-associated disease or disorder, said method comprising the steps of receiving information related to 46508 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 46508 and/or related to a 46508-associated disease or disorder, and based on one or more of the phenotypic information, the 46508 information, and the acquired information, determining whether the subject has a 46508-associated disease or disorder or a pre-disposition to a 46508-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[5868] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

EXAMPLES

Examples for 26443 and 46873

Example 1 Identification and Characterization of Human 26443 or 46873 cDNA

[5869] The human 26443 sequence (FIG. 1; SEQ ID NO:1), which is approximately 1888 nucleotides long, including 5′ and 3′ untranslated regions, contains a predicted methionine-initiated coding sequence of about 1254 nucleotides (SEQ ID NO:3 and nucleotides 91 to 1344 of SEQ ID NO:1). The coding sequence encodes a 418 amino acid protein (SEQ ID NO:2).

[5870] The human 46873 sequence (FIG. 5; SEQ ID NO:4), which is approximately 1358 nucleotides long, including 5′ and 3′ untranslated regions, contains a predicted methionine-initiated coding sequence of about 924 nucleotides (SEQ ID NO:3 and nucleotides 134 to 1057 of SEQ ID NO:1). The coding sequence encodes a 308 amino acid protein (SEQ ID NO:2).

Example 2 Tissue Distribution of 26443 or 46873 mRNA by Large-Scale Tissue-Specific Library Sequencing and by Northern Blot Hybridization

[5871] Northern blot hybridizations with various RNA samples can be performed under standard conditions and washed under stringent conditions, i.e., 0.2× SSC at 65° C. A DNA probe corresponding to all or a portion of the 26443 or 46873 cDNA (SEQ ID NO:1 or SEQ ID NO:4, respectively) can be used. The DNA can be radioactively labeled with ³²P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse liver, hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) can be probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to manufacturer's recommendations.

Example 3 Recombinant Expression of 26443 or 46873 in Bacterial Cells

[5872] In this example, 26443 or 46873 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 26443 or 46873 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-26443 or -46873 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB199 strain by affinity chromatography on glutathione beads. Using polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates, the molecular weight of the resultant fusion polypeptide is determined.

Example 4 Expression of Recombinant 26443 or 46873 Protein in COS Cells

[5873] To express the 26443 or 46873 gene in COS cells, the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 26443 or 46873 protein and an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[5874] To construct the plasmid, the 26443 or 46873 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 26443 or 46873 coding sequence starting from the initiation codon; the 3′ primer contains sequences complementary to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag, and the last 20 nucleotides of the 26443 or 46873 coding sequence. The PCR amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably, the two restriction sites chosen are different so that the 26443 or 46873 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.

[5875] COS cells are subsequently transfected with the 26443- or 46873-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989. The expression of the 26443 or 46873 polypeptide is detected by radiolabeling (³⁵S-methionine or ³⁵S-cysteine, available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1988) using an HA specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with ³⁵S-methionine (or ³⁵S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer, 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

[5876] Alternatively, DNA containing the 26443 or 46873 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 26443 or 46873 polypeptide is detected by radiolabeling and immunoprecipitation using a 26443 or 46873 specific monoclonal antibody.

Examples for 61833

Example 5 Identification and Characterization of Human 61833 cDNA

[5877] The human 61833 nucleic acid sequence is recited as follows: GGCTGCTGGGCTGGCGGGGCGCAGGCCGCGGGACCCGAGCCCGGGGAAGCGAG (SEQ ID NO: 10) AGAGCGGAGGCGCCGAGGATCCGATTCACTCCCTGGGGAGACCTATGGGCCGAA GCCGTGTAAATGCGTTTTAAGCAGAGGCCTCGGCTCCGCAACTGCCACTCCTCCT CGGGGTGTTGCACAAGTTTCGAGGTCACCGGCGACCCCCCCTAGCAGCGCGCCT GGCTCTGGCCCCCGCGAAGGAGGACGGAGTTTGTGTGTTGCATACTTTCTAAGGC GGCGGCTGCAGCAGCGGCTCCATCCAGCCCGTCAGCTCCTCCTGCAAGGCATGG CTGGCTACCTGAGTGAATCGGACTTTGTGATGGTGGAGGAGGGCTTCAGTACCC GAGACCTGCTGAAGGAACTCACTCTGGGGGCCTCACAGGCCACCACGGACGAGG TAGCTGCCTTCTTCGTGGCTGACCTGGGTGCCATAGTGAGGAAGCACTTTTGCTT TCTGAAGTGCCTGCCACGAGTCCGGCCCTTTTATGCTGTCAAGTGCAACAGCAGC CCAGGTGTGCTGAAGGTTCTGGCCCAGCTGGGGCTGGGCTTTAGCTGTGCCAACA AGGCAGAGATGGAGTTGGTCCAGCATATTGGAATCCCTGCCAGTAAGATCATCT GCGCCAACCCCTGTAAGCAAATTGCACAGATCAAATATGCTGCCAAGCATGGGA TCCAGCTGCTGAGCTTTGACAATGAGATGGAGCTGGCAAAGGTGGTAAAGAGCC ACCCCAGTGCCAAGATGGTTCTGTGCATTGCTACCGATGACTCCCACTCCCTGAG CTGCCTGAGCCTAAAGTTTGGAGTGTCACTGAAATCCTGCAGACACCTGCTTGAA AATGCGAAGAAGCACCATGTGGAGGTGGTGGGTGTGAGTTTTCACATTGGCAGT GGCTGTCCTGACCCTCAGGCCTATGCTCAGTCCATCGCAGACGCCCGGCTCGTGT TTGAAATGGGCACCGAGCTGGGTCACAAGATGCACGTTCTGGACCTTGGTGGTG GCTTCCCTGGCACAGAAGGGGCCAAAGTGAGATTTGAAGAGATTGCTTCCGTGA TCAACTCAGCCTTGGACCTGTACTTCCCAGAGGGCTGTGGCGTGGACATCTTTGC TGAGCTGGGGCGCTACTACGTGACCTCGGCCTTCACTGTGGCAGTCAGCATCATT GCCAAGAAGGAGGTTCTGCTAGACCAGCCTGGCAGGGAGGAGGAAAATGGTTCC ACCTCAAGACCCATCGTGTACCACCTTGATGAGGGCGTGTATGGGATCTTCAACT CAGTCCTGTTTGACAACATCTGCCCTACCCCCATCCTGCAGAAGAAACCATCCAC GGAGCAGCCCCTGTACAGCAGCAGCCTGTGGGGCCCGGCGGTTGATGGCTGTGA TTGCGTGGCTGAGGGCCTGTGGCTGCCGCAACTACACGTAGGGGACTGGCTGGT CTTTGACAACATGGGCGCCTACACTGTGGGCATGGGTTCCCCCTTTTGGGGGACC CAGGCCTGCCACATCACCTATGCCATGTCCCGGGTGGCCTGGGAAGCGCTGCGA AGGCAGCTGATGGCTGCAGAACAGGAGGATGACGTGGAGGGTGTGTGCAAGCCT CTGTCCTGCGGCTGGGAGATCACAGACACCCTGTGCGTGGGCCCTGTCTTCACCC CAGCGAGCATCATGTGAGTGGGCCTCGTTCCCCCCGGAGAATCCCAGCGGGGCC TCAGAGATGCATCTGGGAGAGGTGGGCGAGGCAGCGAGCTGGTACCCTCTGGCC AGGACTTCTGGTGCTCGCTCTGCCGCCCCACGCTCCACCTGTAGTGTTTCTGCCCT 
GTAAATAGGACCAGTCTTACACTCGCTTGTAGTTTCAAGTATGCAACATAAATCC TGTCCCTTCCAAAAAAAAAAAAAAAAAAAAA.

[5878] The human 61833 sequence (FIG. 9; SEQ ID NO:10) is approximately 1937 nucleotides long. The nucleic acid sequence includes an initiation codon (ATG) and a termination codon (TGA) which are underscored above. The region between and inclusive of the initiation codon and the termination codon is a methionine-initiated coding sequence of about 1383 nucleotides (nucleotides indicated as “coding” of SEQ ID NO: 10; SEQ ID NO: 12). The coding sequence encodes a 460 amino acid protein (SEQ ID NO:11), which is recited as follows: MAGYLSESDFVMVEEGFSTRDLLKELTLGASQATTDEVAAFFVADLGAIVRKHFCF (SEQ ID NO: 11) LKCLPRVRPFYAVKCNSSPGVLKVLAQLGLGFSCANKAEMELVQHIGIPASKIICANP CKQIAQIKYAAKHGIQLLSFDNEMELAKVVKSHPSAKMVLCIATDDSHSLSCLSLKF GVSLKSCRHLLENAKKHHVEVVGVSFHIGSGCPDPQAYAQSIADARLVFEMGTELG HKMHVLDLGGGFPGTEGAKVRFEEIASVINSALDLYFPEGCGVDIFAELGRYYVTSA FTVAVSIIAKKEVLLDQPGREEENGSTSRPIVYHLDEGVYGIFNSVLFDNICPTPILQK KPSTEQPLYSSSLWGPAVDGCDCVAEGLWLPQLHVGDWLVFDNMGAYTVGMGSP FWGTQACHITYAMSRVAWEALRRQLMAAEQEDDVEGVCKPLSCGWEITDTLCVGP VFTPASIM.
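
The stated lengths are mutually consistent: a 460 amino acid protein plus one termination codon corresponds to 3 × (460 + 1) = 1383 nucleotides. The sketch below checks this arithmetic and translates the first 27 nucleotides of the coding region as recited in SEQ ID NO:10, which yields the first nine residues of SEQ ID NO:11. It assumes the standard genetic code; the function names are hypothetical and the code is illustrative only.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence, stopping at the first termination codon."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        residue = CODON_TABLE[cds[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def expected_cds_length(n_residues, include_stop=True):
    """Nucleotides needed to encode n_residues, optionally counting the stop codon."""
    return 3 * (n_residues + (1 if include_stop else 0))


# First 27 nucleotides of the 61833 coding region, starting at the ATG.
fragment = "ATGGCTGGCTACCTGAGTGAATCGGAC"
print(translate(fragment))        # MAGYLSESD
print(expected_cds_length(460))   # 1383
```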

Example 6 Tissue Distribution of 61833 mRNA by TaqMan Analysis

[5879] Endogenous human 61833 gene expression can be determined using the Perkin-Elmer/ABI 7700 Sequence Detection System which employs TaqMan technology. Briefly, TaqMan technology relies on standard RT-PCR with the addition of a third gene-specific oligonucleotide (referred to as a probe) which has a fluorescent dye coupled to its 5′ end (typically 6-FAM) and a quenching dye at the 3′ end (typically TAMRA). When the fluorescently tagged oligonucleotide is intact, the fluorescent signal from the 5′ dye is quenched. As PCR proceeds, the 5′ to 3′ nucleolytic activity of Taq polymerase digests the labeled probe, producing a free nucleotide labeled with 6-FAM, which is now detected as a fluorescent signal. The PCR cycle at which fluorescence is first released and detected is inversely related to the logarithm of the starting amount of the gene of interest in the test sample, thus providing a quantitative measure of the initial template concentration. Samples can be internally controlled by the addition of a second set of primers/probe specific for a housekeeping gene such as GAPDH which has been labeled with a different fluorophore on the 5′ end (typically VIC).
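
One common way to turn such cycle measurements into a relative expression value is the comparative threshold-cycle calculation, sketched below. The text does not specify a particular calculation, so this is an assumption made for illustration: it supposes that the amount of product roughly doubles each cycle, so that a target first detected k cycles later than the housekeeping control started at about 2^-k of the control's abundance. The function name and the cycle values are hypothetical.

```python
def relative_expression(ct_target, ct_reference, efficiency=2.0):
    """Target abundance relative to a reference gene in the same sample.

    ct_target, ct_reference: cycle numbers at which each signal is first
    detected. efficiency: fold amplification per cycle (2.0 assumes
    perfect doubling, an idealization).
    """
    delta_ct = ct_target - ct_reference
    return efficiency ** (-delta_ct)


# Hypothetical readings: target detected at cycle 25, GAPDH at cycle 20.
print(relative_expression(25.0, 20.0))   # 0.03125, i.e. 1/32 of the control
```

Normalizing to the housekeeping gene in this way compensates for differences in the amount of input RNA between samples.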

Example 7 Tissue Distribution of 61833 mRNA by Northern Analysis

[5880] Northern blot hybridizations with various RNA samples can be performed under standard conditions and washed under stringent conditions, i.e., 0.2× SSC at 65° C. A DNA probe corresponding to all or a portion of the 61833 cDNA (SEQ ID NO:10) can be used. The DNA can be radioactively labeled with ³²P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) can be probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to manufacturer's recommendations.

Example 8 Recombinant Expression of 61833 in Bacterial Cells

[5881] In this example, 61833 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 61833 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-61833 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB199 strain by affinity chromatography on glutathione beads. Using polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates, the molecular weight of the resultant fusion polypeptide is determined.

Example 9 Expression of Recombinant 61833 Protein in COS Cells

[5882] To express the 61833 gene in COS cells, the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 61833 protein and an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[5883] To construct the plasmid, the 61833 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 61833 coding sequence starting from the initiation codon; the 3′ primer contains sequences complementary to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag, and the last 20 nucleotides of the 61833 coding sequence. The PCR amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably, the two restriction sites chosen are different so that the 61833 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.

[5884] COS cells are subsequently transfected with the 61833-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. The expression of the 61833 polypeptide is detected by radiolabelling (³⁵S-methionine or ³⁵S-cysteine, available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) using an HA specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with ³⁵S-methionine (or ³⁵S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer, 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

[5885] Alternatively, DNA containing the 61833 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 61833 polypeptide is detected by radiolabelling and immunoprecipitation using a 61833 specific monoclonal antibody.

Examples for 26493

Example 10 Identification and Characterization of Human 26493 cDNA

[5886] The human 26493 nucleic acid sequence is recited as follows: CGACCCACGCGTCCGCGTCGGAGCTCCTGCAGACCAGTGCGCGCTCGGGGAGTT (SEQ ID NO: 16) GGCGAGCGGGTGGCGGCTGGGAGACGTCCCGAGCGCACGGGACTGACAGGCGG CAGAAGCCGGGCGGGGTCCGCTGGGCTCCGGACCCGTGCCCACCCAGTTCCAGG GCGGCCCCGGGCGGCCCCGCCCCCTCGGTGAATGCCGCGGGCCGGCCAATCCGG GCAGGCCGCGGCGCCGCGCAGCCTATCAGCGGCCAGAGCTCGCGTGCGCTTCCG CGTTCGCGTGCGCTTCCGCGTTCTCGTGAGCTCCCGGCCCGCTGCCGCAGGGACT GGGAGCGGTCTCCGCAGGGACTGGGAGCGGGCTCCGCAGCGCACTCTAGCCCGC GGCTCGGCTCAGTCGGTCTGCGAGGATCCGGCCCGCCGCCCCCCGGGGGACCCG ATGGCCTCGGAGGGCCTGGCGGGGGCGCTGGCTTCCGTGCTGGCTGGCCAGGGG TCCAGCGTGCACAGCTGCGACTCGGCGCCGGCCGGGGAGCCGCCGGCGCCCGTG CGGCTGCGGAAGAACGTGTGCTACGTGGTGCTGGCCGTGTTCCTCAGCGAGCAG GATGAGGTGCTACTGATCCAGGAGGCCAAGAGGGAGTGCCGGGGGTCGTGGTAC CTGCCTGCGGGGAGAATGGAGCCAGGGGAGACCATCGTGGAGGCGCTGCAGCG GGAGGTGAAGGAGGAGGCGGGGCTGCACTGTGAGCCCGAGACACTGCTGTCCGT GGAGGAGCGGGGCCCCTCCTGGGTCCGCTTCGTGTTCCTCGCTCGCCCCACAGGT GGAATTCTCAAGACTTCCAAGGAGGCCGATGCGGAGTCCCTGCAGGCTGCCTGG TACCCACGGACCTCCCTGCCCACTCCGCTGCGAGCCCATGACATCCTGCACCTGG TTGAACTAGCCGCCCAGTATCGCCAGCAAGCCAGGCACCCTCTCATTCTGCCCCA AGAGCTACCCTGTGATCTGGTCTGCCAGCGGCTCGTGGCTACCTTTACCAGCGCC CAGACAGTGTGGGTGTTAGTGGGCACAGTGGGGATGCCTCACTTGCCTGTCACTG CCTGTGGCCTCGACCCTATGGAGCAGAGGGGTGGCATGAAGATGGCCGTCCTGC GGCTGCTGCAGGAGTGTCTGACCCTGCACCACTTGGTGGTGGAGATCAAGGGGT TGCTTGGACTGCAGCACCTGGGCCGAGATCACAGTGATGGCATCTGTTTGAATGT GCTGGTGACCGTGGCTTTTCGGAGCCCAGGGATCCAGGATGAACCCCCAAAAGT TCGGGGTGAGAACTTCTCTTGGTGGAAGGTGATGGAGGAAGACCTGCAAAGCCA GCTCCTCCAGCGGCTTCAGGGATCCTCTGTTGTCCCAGTGAACAGATAGAGAGGT GGAGGAGGTGACAGGGAGCTAGGCAGCCGTGCTCCCTCCAGTGCGGACTTGTCT CCCTCTGAGGGAGGCAAGAGGCTGGCGATCAGGGATCTTGTTGCATTGGGAGCA GGGGCGGCTCTCCTGGTCCCCAGGAGAGATGCTTTGAGGAGCATTCCTCTAGATT GCACAAGGGACAGTGCCTTTAACCAAGCGAGGAGTCCAAAGCTCAGGACCTGAC TACCCTGAGGGCACGCTGACGCCTCTCCCCAGGGGGATGGGGAGCTTTCTGCAC CCCCAGTGGCATCTCCTCATCACGTTCTGTGCCGTCCTTGGGAAAGGCCTGCATT CTGATCCTTCCAGGCCCTTCGAGCATGGAGGGGCACTGGGGAAGGTCCCCCGAG GGAGGAGCACGTTGCTGAGTAAAGAGGTGTTACTCAMMATAAAAAAAAAAAAA 
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGGGCGGCCGCTAG ACTAGTC.

[5887] The human 26493 sequence (SEQ ID NO:16) is approximately 1902 nucleotides long. The nucleic acid sequence includes an initiation codon (ATG) and a termination codon (TAG), which are underscored above. The region between and inclusive of the initiation codon and the termination codon is a methionine-initiated coding sequence of about 1212 nucleotides, including the termination codon (nucleotides indicated as “coding” of SEQ ID NO: 16; SEQ ID NO: 18). The coding sequence encodes a 404 amino acid protein (SEQ ID NO:17), which is recited as follows: MPRAGQSGQAAAPRSLSAARARVRFRVRVRFRVLVSSRPAAAGTGSGLRRDWERA (SEQ ID NO: 17) PQRTLARGSAQSVCEDPARRPPGDPMASEGLAGALASVLAGQGSSVHSCDSAPAGE PPAPVRLRKNVCYVVLAVFLSEQDEVLLIQEAKRECRGSWYLPAGRMEPGETIVEAL QREVKEEAGLHCEPETLLSVEERGPSWVRFVFLARPTGGILKTSKEADAESLQAAWY PRTSLPTPLRAHDILHLVELAAQYRQQARHPLILPQELPCDLVCQRLVATFTSAQTV WVLVGTVGMPHLPVTACGLDPMEQRGGMKMAVLRLLQECLTLHHLVVEIKGLLGL QHLGRDHSDGICLNVLVTVAFRSPGIQDEPPKVRGENFSWWKVMEEDLQSQLLQRL QGSSVVPVNR.
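The relationship between the coding sequence and the encoded protein can be checked with a short sketch. It translates the first 30 nucleotides of the 26493 coding sequence, copied from SEQ ID NO:16 above, using the standard genetic code, and computes the nucleotides accounted for by 404 sense codons. The function name `translate` is illustrative, and the sketch does not handle ambiguity codes such as M, which would raise a KeyError.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate in frame 0 until the first stop codon or the end of the input."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 30 nt of the 26493 coding sequence, starting at the initiation codon.
fragment = "ATGCCGCGGGCCGGCCAATCCGGGCAGGCC"
print(translate(fragment))   # MPRAGQSGQA, the first ten residues of SEQ ID NO:17

sense_nt = 404 * 3           # nucleotides covered by 404 sense codons
print(sense_nt)              # 1212
```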

Example 11 Tissue Distribution of 26493 mRNA by TaqMan Analysis

[5888] Endogenous human 26493 gene expression was determined using the Perkin-Elmer/ABI 7700 Sequence Detection System, which employs TaqMan technology. Briefly, TaqMan technology relies on standard RT-PCR with the addition of a third gene-specific oligonucleotide (referred to as a probe) which has a fluorescent dye coupled to its 5′ end (typically 6-FAM) and a quenching dye at the 3′ end (typically TAMRA). When the fluorescently tagged oligonucleotide is intact, the fluorescent signal from the 5′ dye is quenched. As PCR proceeds, the 5′ to 3′ nucleolytic activity of Taq polymerase digests the labeled probe, producing a free nucleotide labeled with 6-FAM, which is now detected as a fluorescent signal. The PCR cycle at which fluorescence is first detected above background is inversely related to the logarithm of the starting amount of the gene of interest in the test sample, thus providing a quantitative measure of the initial template concentration. Samples can be internally controlled by the addition of a second primer/probe set specific for a housekeeping gene such as GAPDH, whose probe is labeled with a different fluorophore on the 5′ end (typically VIC).
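One common way to turn threshold cycles into a relative expression value is the comparative Ct (delta-Ct) calculation, sketched below. This specification does not state which calculation was used, so the method, the assumed amplification efficiency of 2 per cycle, and the example Ct values are all assumptions for illustration.

```python
def relative_expression(ct_target: float, ct_reference: float,
                        efficiency: float = 2.0) -> float:
    """Target level relative to a reference gene by the delta-Ct method.

    A lower Ct means more starting template, so the sign of the exponent
    is negative. efficiency=2.0 assumes perfect doubling every cycle.
    """
    delta_ct = ct_target - ct_reference
    return efficiency ** (-delta_ct)


# Hypothetical Ct values, not measurements from this document.
print(relative_expression(24.0, 25.0))  # target crosses one cycle earlier -> 2.0
print(relative_expression(27.0, 25.0))  # target crosses two cycles later -> 0.25
```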

[5889] To determine the level of 26493 in various human tissues a primer/probe set was designed. Total RNA was prepared from a series of human tissues using an RNeasy kit from Qiagen. First strand cDNA was prepared from 1 μg total RNA using an oligo-dT primer and Superscript II reverse transcriptase (Gibco/BRL). cDNA obtained from approximately 50 ng total RNA was used per TaqMan reaction.

[5890] Table 2 below shows expression of 26493 mRNA in various normal and diseased tissues, detected using TaqMan analysis. The mRNA expression data for 26493 indicate that 26493 mRNA was highly expressed in many tissues, for example, coronary smooth muscle cells (SMC), human umbilical vein endothelial cells (HUVEC), kidney, pancreas, brain (hypothalamus and cortex), dorsal root ganglion (DRG), and erythroid cells. More moderate levels of expression were noted in normal artery, nerve, normal breast, prostate (normal and tumor), lung tumor and lung COPD, tonsil, lymph node, neutrophils and megakaryocytes. All tissues included in the panel exhibited at least low levels of expression.

TABLE 2
Tissue Distribution of 26493 mRNA by TaqMan Analysis

Sample                      Expression
Artery normal               1.06
Aorta diseased              0.41
Vein normal                 0.18
Coronary SMC                5.30
HUVEC                       8.70
Hemangioma                  0.18
Heart normal                0.31
Heart CHF                   0.35
Kidney                      2.34
Skeletal Muscle             0.39
Adipose normal              0.32
Pancreas                    5.10
primary osteoblasts         0.37
Osteoclasts (diff)          0.08
Skin normal                 0.46
Spinal cord normal          0.26
Brain Cortex normal         8.70
Brain Hypothalamus normal   3.67
Nerve                       0.56
DRG (Dorsal Root Ganglion)  4.00
Breast normal               1.20
Breast tumor                0.47
Ovary normal                0.33
Ovary Tumor                 0.11
Prostate Normal             1.05
Prostate Tumor              0.88
Salivary glands             0.22
Colon normal                0.35
Colon Tumor                 0.61
Lung normal                 0.22
Lung tumor                  1.36
Lung COPD                   0.86
Colon IBD                   0.27
Liver normal                0.12
Liver fibrosis              0.18
Spleen normal               0.26
Tonsil normal               0.59
Lymph node normal           0.53
Small intestine normal      0.22
Macrophages                 0.03
Synovium                    0.09
BM-MNC                      0.29
Activated PBMC              0.04
Neutrophils                 0.51
Megakaryocytes              0.65
Erythroid                   4.73
positive control            2.78

Example 12 Tissue Distribution of 26493 mRNA by Northern Analysis

[5891] Northern blot hybridizations with various RNA samples can be performed under standard conditions and washed under stringent conditions, i.e., 0.2× SSC at 65° C. A DNA probe corresponding to all or a portion of the 26493 cDNA (SEQ ID NO: 16) can be used. The DNA was radioactively labeled with ³²P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) can be probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to manufacturer's recommendations.

Example 13 Recombinant Expression of 26493 in Bacterial Cells

[5892] In this example, 26493 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 26493 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB 199. Expression of the GST-26493 fusion protein in PEB 199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB 199 strain by affinity chromatography on glutathione beads. Using polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates, the molecular weight of the resultant fusion polypeptide is determined.
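Before running the gel, the expected size of the fusion polypeptide can be estimated roughly, which helps in recognizing the correct band. The sketch below assumes a GST moiety of about 26 kDa and a rule-of-thumb average of 110 Da per amino acid residue; both figures are general approximations, not values stated in this specification, and the function name is illustrative.

```python
GST_KDA = 26.0           # approximate mass of the GST moiety (assumption)
AVG_RESIDUE_DA = 110.0   # rule-of-thumb average residue mass (assumption)


def estimated_fusion_mass_kda(n_residues: int, tag_kda: float = GST_KDA) -> float:
    """Rough expected mass of a tag-protein fusion in kDa."""
    return tag_kda + n_residues * AVG_RESIDUE_DA / 1000.0


# 404 residues is the length given for the 26493 protein (SEQ ID NO:17).
estimate = estimated_fusion_mass_kda(404)
print(round(estimate, 1))  # about 70.4 kDa under these assumptions
```

Migration on SDS-PAGE can deviate from such an estimate, so the value is only a guide to where the band should appear.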

Example 14 Expression of Recombinant 26493 Protein in COS Cells

[5893] To express the 26493 gene in COS cells, the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 26493 protein and an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[5894] To construct the plasmid, the 26493 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 26493 coding sequence starting from the initiation codon; the 3′ end sequence contains complementary sequences to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag and the last 20 nucleotides of the 26493 coding sequence. The PCR amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably the two restriction sites chosen are different so that the 26493 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.

[5895] COS cells are subsequently transfected with the 26493-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. The expression of the 26493 polypeptide is detected by radiolabelling (³⁵S-methionine or ³⁵S-cysteine available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) using an HA-specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with ³⁵S-methionine (or ³⁵S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer, 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA-specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

[5896] Alternatively, DNA containing the 26493 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 26493 polypeptide is detected by radiolabelling and immunoprecipitation using a 26493 specific monoclonal antibody.

Examples for 58224

Example 15 Identification and Characterization of Human 58224 cDNA

[5897] The human 58224 nucleic acid sequence is recited as follows: TACTATAGGGAGTCGCCCACGCGTCCGGGCAGCGGTTGTGAGGAGTTAGCTCGC (SEQ ID NO: 22) GGCATTGCAGGCTCTGAGAGGAGGGGACCCGGTTCCCGGGTGAGTGTCCAGGC A TG CCAGCGGAACGGCCCGCGGGCAGCGGCGGCTCGGAGGCTCCAGCAATGGTT GAACAACTGGACACTGCTGTGATTACCCCGGCCATGCTAGAAGAGGAAGAACAG CTTGAAGCTGCTGGACTAGAGAGAGAGCGGAAGATGCTGGAAAAGGCTCGCATG TCTTGGGATAGAGAGTCGACAGAAATTCGGTACCGTAGACTTCAACATTTGCTTG AAAAAAGCAATATMTACTCCAAATTTTTATTGACGAAAATGGAACAGCAACAAT TAGAGGAACAGAAGAAGAAAGAAAAATTGGAGAGAAAAAAGGAGTCTTTAAAA GTTAAAAAGGGTAAAAATTCAATTGATGCAAGTGAAGAGAAGCCAGTTATGAGG AAAAAAAGAGGAAGAGAAGATGAATCATACAATATTTCAGAGGTCATGTCAAA AGAGGAAATTTTGTCTGTGGCTAAAAAAAATAAAAAGGAGAATGAGGATGAAA ACTCCTCCTCTACTAATCTCTGTGTGGAAGATCTTCAGAAAAATAAAGATTCGAA TAGTATAATTAAAGATAGATTGTCTGAAACGGTTAGGCAGAATACTAAATTCTTT TTTGACCCAGTCCGGAAGTGTAATGGTCAGCCAGTACCTTTTCAACAACCAAAGC ACTTCACTGGAGGAGTGATGCGATGGTACCAAGTAGAAGGCATGGAATGGCTTA GGATGCTTTGGGAAAATGGAATTAATGGCATTTTAGCAGATGAAATGGGATTGG GTAAGACAGTTCAGTGCATTGCTACTATTGCATTGATGATTCAGAGAGGAGTACC AGGACCTTTTCTTGTCTGTGGCCCTTTGTCTACACTTCCTAACTGGATGGCTGAAT TCAAAAGATTTACACCAGATATCCCTACAATGTTATATCATGGAACCCAGGAGG ACCGTCGAAAATTGGTAAGAAATATTTACAAAAGACAAGGGACACTGCAGATTC ATCCTGTGGTGGTCACATCATTCGAGATCGCTATGCGAGACCAGAATGCTTTACA GCATTGCTATTGGAAATACTTAATAGTAGATGAAGGACACAGGATTAAGAATAT GAAGTGCCGTCTAATCAGGGAGTTAAAACGATTCAATGCTGATAACAAACTTCTT TTGACTGGTACTCCCTTGCAAAACAATTTATCAGAACTTTGGTCATTGCTAAACT TTTTGTTGCCAGATGTATTTGATGACTTGAAAAGCTTTGAGTCTTGGTTTGACATC ACTAGTCTTTCTGAAACTGCTGAAGATATTATTGCTAAAGAAAGAGAACAGAAT GTATTGCATATGCTGCACCAGATTTTAACACCTTTCTTATTGAGAAGACTGAAGT CTGATGTTGCTCTTGAAGTTCCTCCTAAACGAGAAGTAGTCGTTTATGCTCCACTT TCAAAGAAGCAGGAGATCTTTTATACAGCCATTGTGAACCGTACAATTGCAAAC ATGTTTGGATCCAGTGAGAAAGAAACAATTGAGTTAAGTCCTACTGGTCGACCA AAACGACGAACTAGAAAATCAATAAATTACAGCAAAATAGATGATTTCCCTAAT GAATTGGAAAAACTGATCAGTCAAATACAGCCAGAGGTGGACCGAGAAAGAGC TGTTGTGGAAGTGAATATCCCTGTAGAATCTGAAGTTAATCTGAAGCTGCAGAAT ATAATGATGCTACTTCGTAAATGTTGTAATCATCCATATTTGATTGAATATCCTAT 
AGACCCTGTTACACAAGAATTTAAGATCGATGAAGAATTGGTAACAAATTCTGG GAAGTTCTTGATTTTGGATCGAATGCTGCCAGAACTAAAAAAAAGAGGTCACAA GGTGCTGCTTTTTTCACAAATGACAAGCATGTTGGACATTTTGATGGATTACTGC CATCTCAGAGATTTCAACTTCAGCAGGCTTGATGGGTCCATGTCTTACTCAGAGA GAGAAAAAAACATGCACAGCTTCAACACGGATCCAGAGGTGTTTATCTTCTTAG TGAGTACACGAGCTGGTGGCCTGGGCATTAATCTGACTGCAGCAGATACAGTTA TCATTTATGATAGTGATTGGAACCCCCAGTCGGATCTTCAGGCCCAGGATAGATG TCATAGAATTGGTCAGACAAAGCCAGTTGTTGTTTATCGCCTTGTTACAGCAAAT ACTATCGATCAGAAAATTGTGGAAAGAGCAGCTGCTAAAAGGAAACTGGAAAA GTTGATCATCCATAAAAATCATTTCAAAGGTGGTCAGTCTGGATTAAATCTGTCT AAGAATTTCTTAGATCCTAAGGAATTAATGGAATTATTAAAATCTAGAGATTATG AAAGGGAAATAAAAGGATCAAGAGAGAAGGTCATTAGTGATAAAGATCTAGAG TTGTTGTTAGATCGAAGTGATCTTATTGATCAAATGAATGCTTCAGGACCAATTA AAGAGAAGATGGGGATATTCAAGATATTAGAAAATTCTGAAGATTCCAGTCCTG AATGTTTGTTT TAA AGTGGAGCTCAAGAATAGCTTTTAAAAGTTCTTATTTACAT CTAGTGATTTCCCTGTATTGGGTTTGAAATACTGATTGTCCACTTCACCTTTTTTA TTATATCAGTTGACATGTAACTAGTACCATGCCGTACCTTAAATAGATGGTAATT TTCTGAGCCTTTCCCAAGAACA

[5898] The human 58224 sequence (SEQ ID NO:22), is approximately 2798 nucleotides long including untranslated regions. The nucleic acid sequence includes a preferred initiation codon (ATG) and a termination codon (TAA) which are double underlined and bolded above. Other methionine residues may also be used as initiation codons. The region between and inclusive of the preferred initiation codon and the termination codon is a methionine-initiated coding sequence of about 2514 nucleotides (nucleotides 108 to 2621 of SEQ ID NO:22) designated as SEQ ID NO:24. The coding sequence encodes a 838 amino acid protein (SEQ ID NO:23), the sequence of which is recited as follows: MPAERPAGSGGSEAPAMVEQLDTAVITPAMLEEEEQLEAAGLERERKMLEKARMS (SEQ ID NO: 23) WDRESTEIRYRRLQHLLEKSNIYSKFLLTKMEQQQLEEQKKKEKLERKKESLKVKK GKNSIDASEEKPVMRKKRGREDESYNISEVMSKEEILSVAKKNKKENEDENSSSTNL CVEDLQKNKDSNSIIKDRLSETVRQNTKFFFDPVRKCNGQPVPFQQPKHFTGGVMR WYQVEGMEWLRMLWENGINGILADEMGLGKTVQCIATIALMIQRGVPGPFLVCGPL STLPNWMAEFKRFTPDIPTMLYHGTQEDRRKLVRNIYKRQGTLQIHPVVVTSFEIAM RDQNALQHCYWKYLIVDEGHRIKNMKCRLIRELKRFNADNKLLLTGTPLQNNLSEL WSLLNFLLPDVFDDLKSFESWFDITSLSETAEDIIAKEREQNVLHMLHQILTPFLLRRL KSDVALEVPPKREVVVYAPLSKKQEIFYTAIVNRTIANMFGSSEKETIELSPTGRPKR RTRKSINYSKIDDFPNELEKLISQIQPEVDRERAVVEVNIPVESEVNLKLQNIMMLLRK CCNHPYLIEYPIDPVTQEFKIDEELVTNSGKFLILDRMLPELKKRGHKVLLFSQMTSM LDILMDYCHLRDFNFSRLDGSMSYSEREKNMHSFNTDPEVFIFLVSTRAGGLGINLT AADTVIIYDSDWNPQSDLQAQDRCHRIGQTKPVVVYRLVTANTIDQKIVERAAAKR KLEKLIIHKNHFKGGQSGLNLSKNFLDPKELMELLKSRDYEREIKGSREKVISDKDLE LLLDRSDLIDQMNASGPIKEKMGIFKILENSEDSSPECLF

Example 16 Tissue Distribution of 58224 mRNA

[5899] Endogenous human 58224 gene expression was determined using the Perkin-Elmer/ABI 7700 Sequence Detection System, which employs TaqMan technology. Briefly, TaqMan technology relies on standard RT-PCR with the addition of a third gene-specific oligonucleotide (referred to as a probe) which has a fluorescent dye coupled to its 5′ end (typically 6-FAM) and a quenching dye at the 3′ end (typically TAMRA). When the fluorescently tagged oligonucleotide is intact, the fluorescent signal from the 5′ dye is quenched. As PCR proceeds, the 5′ to 3′ nucleolytic activity of Taq polymerase digests the labeled probe, producing a free nucleotide labeled with 6-FAM, which is now detected as a fluorescent signal. The PCR cycle at which fluorescence is first detected above background is inversely related to the logarithm of the starting amount of the gene of interest in the test sample, thus providing a quantitative measure of the initial template concentration. Samples were internally controlled by the addition of a second primer/probe set specific for a reference gene such as β2-microglobulin or GAPDH, whose probe is labeled with a different fluorophore on the 5′ end (typically VIC).

[5900] To determine the level of 58224 in various human tissues a primer/probe set was designed. Total RNA was prepared from a series of human tissues using an RNeasy kit from Qiagen. First strand cDNA was prepared from 1 μg total RNA using an oligo-dT primer and Superscript II reverse transcriptase (Gibco/BRL). cDNA obtained from approximately 50 ng total RNA was used per TaqMan reaction. Tissues tested include the human tissues and several cell lines shown in the left column of the tables below.

[5901] 58224 mRNA expression was analyzed by TaqMan in a number of clinical samples, including human breast, ovary, lung, colon, and liver tissue. In general, but not always, 58224 tended to be more highly expressed in tumor tissue than in normal tissue (see Table 3). For example, 58224 mRNA was expressed at higher levels in some breast tumor, ovarian tumor, lung tumor, and colon tumor samples (T) than in control samples (N) of the same tissue. In addition, expression of 58224 mRNA was elevated in liver metastases and in fetal liver (Table 4).

TABLE 3

Tissue sample   Relative Expression
Breast N        0.00
Breast N        0.33
Breast N        0.00
Breast T        0.24
Breast T        0.03
Breast T        0.00
Breast T        0.00
Breast T        3.21
Breast T        0.62
Breast T        0.00
Breast T        1.48
Ovary N         0.11
Ovary N         0.01
Ovary N         0.17
Ovary N         0.00
Ovary T         0.02
Ovary T         0.01
Ovary T         1.31
Ovary T         0.00
Ovary T         0.04
Ovary T         0.55
Ovary T         0.38
Lung N          0.00
Lung N          0.00
Lung N          0.00
Lung N          0.00
Lung T          2.32
Lung T          3.54
Lung T          0.33
Lung T          0.10
Lung T          4.25
Lung T          1.67
Lung T          0.04
Lung T          0.05
Colon N         0.00
Colon N         0.00
Colon N         0.00
Colon N         0.00
Colon T         1.55
Colon T         0.01
Colon T         0.01
Colon T         0.00
Colon T         0.39

[5902] TABLE 3: 58224 expression is upregulated in some breast, ovary, lung, and colon tumor tissue samples. Relative expression is relative to expression of β2-microglobulin.

TABLE 4

Tissue sample   Relative Expression
Liver Nor       0.00
Liver Nor       0.00
Liver Met       0.08
Liver Met       0.75
Liver Met       0.65
Liver Met       0.71
Brain N         0.00
Brain N         0.88
Brain N         0.39
Astrocytes      6.07
Brain T         0.02
Brain T         0.00
Brain T         0.00
Brain T         1.72
Brain T         0.00
HMVEC-Arr       0.11
HMVEC-Prol      2.10
Fetal Liver     10.13
Fetal Liver     0.82

[5903] TABLE 4: 58224 expression is upregulated in some liver metastases. 58224 is also expressed in some brain tissues and fetal liver. Relative expression is relative to expression of β2-microglobulin.
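The tumor-versus-normal comparison drawn from Table 3 can be summarized with a small sketch that averages the reported relative expression values per tissue and sample type. The values are transcribed from Table 3; the choice of a simple mean and the function name `mean_by_status` are assumptions for illustration, not the analysis used in this specification.

```python
from statistics import mean

# Relative expression values transcribed from Table 3 (N = normal, T = tumor).
TABLE_3 = {
    ("Breast", "N"): [0.00, 0.33, 0.00],
    ("Breast", "T"): [0.24, 0.03, 0.00, 0.00, 3.21, 0.62, 0.00, 1.48],
    ("Ovary", "N"): [0.11, 0.01, 0.17, 0.00],
    ("Ovary", "T"): [0.02, 0.01, 1.31, 0.00, 0.04, 0.55, 0.38],
    ("Lung", "N"): [0.00, 0.00, 0.00, 0.00],
    ("Lung", "T"): [2.32, 3.54, 0.33, 0.10, 4.25, 1.67, 0.04, 0.05],
    ("Colon", "N"): [0.00, 0.00, 0.00, 0.00],
    ("Colon", "T"): [1.55, 0.01, 0.01, 0.00, 0.39],
}


def mean_by_status(table):
    """Return {tissue: {"N": mean, "T": mean}} from per-sample values."""
    summary = {}
    for (tissue, status), values in table.items():
        summary.setdefault(tissue, {})[status] = mean(values)
    return summary


summary = mean_by_status(TABLE_3)
for tissue, levels in summary.items():
    print(f"{tissue}: normal {levels['N']:.2f}, tumor {levels['T']:.2f}")
```

A mean hides the sample-to-sample spread: several tumor samples in each tissue show no detectable expression, consistent with the "in general, but not always" qualification above.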

Example 17 Recombinant Expression of 58224 in Bacterial Cells

[5904] In this example, 58224 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 58224 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-58224 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB199 strain by affinity chromatography on glutathione beads. Using polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates, the molecular weight of the resultant fusion polypeptide is determined.

Example 18 Expression of Recombinant 58224 Protein in COS Cells

[5905] To express the 58224 gene in COS cells, the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 58224 protein and an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[5906] To construct the plasmid, the 58224 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 58224 coding sequence starting from the initiation codon; the 3′ end sequence contains complementary sequences to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag and the last 20 nucleotides of the 58224 coding sequence. The PCR amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably the two restriction sites chosen are different so that the 58224 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.

[5907] COS cells are subsequently transfected with the 58224-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989. The expression of the 58224 polypeptide is detected by radiolabelling (³⁵S-methionine or ³⁵S-cysteine available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1988) using an HA-specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with ³⁵S-methionine (or ³⁵S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer, 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA-specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

[5908] Alternatively, DNA containing the 58224 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 58224 polypeptide is detected by radiolabelling and immunoprecipitation using a 58224 specific monoclonal antibody.

Examples for 46980

Example 19 Identification and Characterization of Human 46980 cDNA

[5909] The human 46980 nucleic acid sequence is recited as follows: TCCGACCCACGCGTCCGACTAGTTCTAGATCGCGATCTAGAACTAGCGGGGACA (SEQ ID NO: 27) CACTATTGACAGCAGAAACAATGAATTTCCTCCAAACCCGGCAATGTTGGTGGCT CTTGCATTCCTCTGGATGAGCGAATCTAGTTGGGGGGTTCCCGAAGGGGAAGGC GCCTGGGCTTTCAATACATCCTCCTGAATCATACTGCGTTTCAGGTTCCTTAGAA AAATTTGGATGTGTAAAAAGAACTCTTAACGGCGATGCAGGTCTTCCACAGCTA AGGTTGCATTGGAGTTTTCGAAAGACTTATCTTTCTGCAGGCTCGCCTCTGAGCT TTGTCTCCTTGGAGCCACCTCACTTAGACAGCTTCGGATGTGGATGCAGATTTGA ACCATGTTGCGTCCCCAGGGACTGCTATGGCTCCCTTTGTTGTTCACCTCTGTCTG TGTCATGTTAAACTCCAATGTTCTTCTGTGGATAACTGCTCTTGCCATCAAGTTCA CCCTCATTGACAGCCAAGCACAGTATCCAGTTGTCAACACAAATTATGGTAAAAT CCAGGGCCTAAGAACACCATTACCCAGTGAGATCTTGGGTCCAGTGGAGCAGTA CTTAGGGGTCCCCTATGCCTCACCCCCAACTGGAGAGAGGCGGTTTCAGCCACC AGAATCCCCATCCTCCTGGACTGGCATCCGAAATGCTACTCAGTTTTCTGCTGTG TGCCCCCAGCACCTGGATGAAAGATTCTTATTGCATGACATGCTGCCCATCTGGT TTACCACCAGTTTGGATACTTTGATGACCTATGTTCAAGATCAAAATGAAGACTG CCTTTACTTAAACATCTATGTGCCCATGGAAGATGATATTCATGAACAGAACAGT AAGAAGCCTGTTATGGTCTATATCCATGGGGGATCTTACATGGAGGGAACCGGT AACATGATTGATGGCAGCATTTTGGCCAGCTATGGGAACGTCATCGTTATCACCA TTAACTACCGTCTGGGAATACTAGGGTTTTTAAGTACCGGTGACCAGGCAGCAA AAGGCAACTATGGGCTCCTGGATCAGATTCAAGCACTGAGGTGGATTGAGGAGA ATGTCGGAGCCTTTGGCGGGGACCCCAAGAGAGTGACTATCTTTGGCTCGGGGG CTGGGGCCTCCTGTGTCAGCCTGTTGACCCTGTCCCACTACTCAGAAGGTCTCTT CCAGAAGGCCATCATTCAGAGCGGCACTGCCCTGTCCAGCTGGGCAGTGAACTA CCAGCCGGCCAAGTACACTCGGATATTGGCAGACAAGGTCGGCTGCAACATGCT GGACACCACGGACATGGTAGAATGTCTGAAGAACAAGAACTACAAGGAGCTCAT CCAGCAGACCATCACCCCGGCCACCTACCACATAGCCTTTGGGCCGGTGATCGA CGGCGACGTCATCCCAGACGACCCCCAGATCCTGATGGAGCAAGGCGAGTTCCT CAACTACGACATCATGCTGGGCGTCAACCAAGGGGAAGGCCTGAAGTTCGTGGA CGGCATCGTGGATAACGAGGACGGTGTGACGCCCAACGACTTTGACTTCTCCGT GTCCAACTTCGTGGACAACCTTTACGGCTACCCTGAAGGGAAAGACACTTTGCG GGAGACTATCAAGTTCATGTACACAGACTGGGCCGATAAGGAAAACCCGGAGAC GCGGCGGAAAACCCTGGTGGCTCTCTTTACTGACCATCAGTGGGTGGCCCCCGCC GTGGCCACCGCCGACCTGCACGCGCAGTACGGCTCCCCCACCTACTTCTATGCCT TCTATCATCACTGCCAAAGCGAAATGAAGCCCAGCTGGGCAGATTCGGCCCATG 
GCGATGAAGTCCCCTATGTCTTCGGCATCCCCATGATCGGTCCCACAGAGCTCTT CAGTTGTAATTTCTCCAAGAACGACGTCATGCTCAGTGCCGTGGTGATGACCTAC TGGACGAACTTCGCCAAAACTGGTGATCCAAACCAACCAGTTCCTCAGGATACC AAGTTCATTCATACAAAACCCAATCGCTTTGAAGAAGTGGCCTGGTCCAAGTATA ATCCCAAAGACCAGCTCTATCTGCATATTGGCTTGAAACCCAGAGTGAGAGATC ACTACCGGGCAACGAAAGTGGCTTTCTGGTTGGAATTGGTTCCTCATTTGCACAA CTTGAACGAGATATTCCAGTATGTTTCAACAACCACAAAGGTTCCTCCACCAGAC ATGACATCATTTCCCTATGGCACCCGGCGATCTCCCGCCAAGATATGGCCAACCA CCAAACGCCCAGCAATCACTCCTGCCAACAATCCCAAACACTCTAAGGACCCTC ACAAAACAGGGCCCGAGGACACAACTGTCCTCATTGAAACCAAACGAGATTATT CCACCGAATTAAGTGTCACCATTGCCGTCGGGGCGTCGCTCCTCTTCCTCAACAT CTTAGCCTTTGCGGCGCTGTACTACAAAAAGGACAAGAGGCGCCATGAGACTCA CAGGCACCCCAGTCCCCAGAGAAACACCACAAATGATATCACTCACATCCAGAA CGAAGAGATCATGTCTCTGCAGATGAAGCAGCTGGAACACGATCACGAGTGTGA GTCGCTGCAGGCACACGACACGCTGAGGCTCACCTGCCCTCCAGACTACACCCT CACGCTGCGCCGGTCGCCGGATGACATCCCATTTATGACGCCAAACACCATCAC CATGATTCCAAACACATTGATGGGGATGCAGCCTTTACACACTTTTAAAACCTTC AGTGGAGGACAAAACAGTACAAATTTACCCCACGGACATTCCACCACTAGAGTA TAGCTTTTCCCTATTTCCCCTCCTATCCCTCTGCCCCTACTGCTCAGCAATGTAAA AGAGACAAATAAGGAGAAAGAAAATCTCCAAACCAGGAATGTTTTTGTGCCACT GACTTTAGATAAAAATGCAAAAGGGCAGTCATCCTGTCCCAGCAGACCCTTCTC ATTGGCATTTTCCAGTATTGTGAGATCAATTTCTGACCATATGAAATGTGAAAAG TATATGTTTCTGTTACAATACTGCTTTAAGATCTAAACCATGCCAACAGATGTTTC GTGTGACTAGGACATCACCATTTCAAGGAACTGTGTGTTTCCAACATCATGGTAG CAGCACACACTTCCAAAGCTCAGCCAGGGACACTTAATATTTTTTAATTACAATG GAAATTTAAACATTTTTATGTGGGCTACACAATGGATGGCTCTTCTTAAGTGAAG AAAGACTCTATAGGCTTTTACACAGCACATGAAGCAGTAATCCAGAAAGAAGGA AATGCAGAATTTTATTATCAAAGTAAGCGAATTGACTGTGCAGAAAAATTGTAG GGTTCTGTGGAAGGAGGTATTCTGCCAGCCTGAACTATATTTAAGAAACTTTGTA AAAAATAAAAATGTATATAGCTGTGAGCTCAAACAAAAACTGCAAAAAAAAAA AAAAAAAAAAAAA.

Example 20 Tissue Distribution of 46980 mRNA by TaqMan Analysis

[5910] Endogenous human 46980 gene expression was determined using the Perkin-Elmer/ABI 7700 Sequence Detection System, which employs TaqMan technology. Briefly, TaqMan technology relies on standard RT-PCR with the addition of a third gene-specific oligonucleotide (referred to as a probe) which has a fluorescent dye coupled to its 5′ end (typically 6-FAM) and a quenching dye at the 3′ end (typically TAMRA). When the fluorescently tagged oligonucleotide is intact, the fluorescent signal from the 5′ dye is quenched. As PCR proceeds, the 5′ to 3′ nucleolytic activity of Taq polymerase digests the labeled probe, producing a free nucleotide labeled with 6-FAM, which is now detected as a fluorescent signal. The PCR cycle at which fluorescence is first detected above background is inversely related to the logarithm of the starting amount of the gene of interest in the test sample, thus providing a quantitative measure of the initial template concentration. Samples can be internally controlled by the addition of a second primer/probe set specific for a housekeeping gene such as GAPDH, whose probe is labeled with a different fluorophore on the 5′ end (typically VIC).

[5911] To determine the level of 46980 in various human tissues a primer/probe set was designed. Total RNA was prepared from a series of human tissues using an RNeasy kit from Qiagen. First strand cDNA was prepared from 1 μg total RNA using an oligo-dT primer and Superscript II reverse transcriptase (Gibco/BRL). cDNA obtained from approximately 50 ng total RNA was used per TaqMan reaction. Tissues tested include the human tissues and several cell lines shown in Table 5 below. Expression levels are relative to β2-microglobulin.

TABLE 5

Tissue            Relative Expression
Adrenal Gland     0.93
Brain             74.58
Heart             0.82
Kidney            0.41
Liver             0.24
Lung              0.40
Mammary Gland     1.82
Placenta          2.73
Prostate          3.87
Salivary Gland    0.56
Muscle            0.92
Sm. Intestine     2.72
Spleen            0.24
Stomach           2.45
Testis            12.05
Thymus            5.00
Trachea           0.38
Uterus            1.01
Spinal Cord       7.39
Skin              0.13
DRG               8.34

[5912] 46980 mRNA was detected as highly expressed in brain tissue. 46980 expression was also found in testis, spinal cord, dorsal root ganglia, prostate, and thymus (Table 5).

Example 21 Tissue Distribution of 46980 mRNA by Northern Analysis

[5913] Northern blot hybridizations with various RNA samples can be performed under standard conditions and washed under stringent conditions, i.e., 0.2× SSC at 65° C. A DNA probe corresponding to all or a portion of the 46980 cDNA (SEQ ID NO:27) can be used. The DNA was radioactively labeled with ³²P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) can be probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to manufacturer's recommendations.

Example 22 Recombinant Expression of 46980 in Bacterial Cells

[5914] In this example, 46980 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 46980 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-46980 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB 199 strain by affinity chromatography on glutathione beads. Using polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates, the molecular weight of the resultant fusion polypeptide is determined.

Example 23 Expression of Recombinant 46980 Protein in COS Cells

[5915] To express the 46980 gene in COS cells (e.g., COS-7 cells, CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182), the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 46980 protein and an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[5916] To construct the plasmid, the 46980 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 46980 coding sequence starting from the initiation codon; the 3′ primer contains sequences complementary to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag, and the last 20 nucleotides of the 46980 coding sequence. The PCR-amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably the two restriction sites chosen are different so that the 46980 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.

[5917] COS cells are subsequently transfected with the 46980-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. The expression of the 46980 polypeptide is detected by radiolabelling (³⁵S-methionine or ³⁵S-cysteine, available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) using an HA-specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with ³⁵S-methionine (or ³⁵S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer: 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA-specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

[5918] Alternatively, DNA containing the 46980 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 46980 polypeptide is detected by radiolabelling and immunoprecipitation using a 46980-specific monoclonal antibody.

Examples for 32225

Example 24 Identification and Characterization of Human 32225 cDNA

[5919] The human 32225 nucleic acid sequence is recited as follows: GGCTCGCCAGGACCTGGCAAGGCTTGTTTACTATGGCCGATGATCTGGAGCAGC (SEQ ID NO: 33) AGTCTCAAGGCTGGCTGAGTAGCTGGCTGCCCACGTGGCGCCCCACTTCCATGTC TCAGCTGAAGAATGTGGAAGCCAGGATCCTCCAGTGTCTCCAGAATAAGTTCCT GGCCAGATATGTATCCCTCCCAAACCAGAATAAGATCTGGACGGTGACTGTGAG CCCCGAGCAAAACGACCGCACCCCCTTGGTGATGGTGCATGGTTTTGGGGGCGG CGTGGGTCTCTGGATCCTCAACATGGACTCACTGAGTGCCCGCCGCACACTGCAC ACCTTCGATCTGCTTGGCTTCGGGCGAAGCTCAAGGCCAGCATTCCCAAGGGAC CCGGAGGGGGCTGAGGATGAGTTTGTGACATCGATAGAGACATGGCGGGAGACC ATGGGGATCCCCAGCATGATCCTCCTGGGGCACAGTTTGGGAGGATTCCTGGCC ACTTCTTACTCAATCAAGTACCCTGATAGAGTTAAACACCTCATCCTGGTGGACC CATGGGGCTTTCCCCTCCGACCAACTAACCCCAGTGAGATCCGTGCACCCCCAGC CTGGGTCAAAGCCGTGGCATCTGTCCTAGGACGTTCCAATCCATTGGCTGTTCTT CGAGTAGCTGGGCCCTGGGGGCCTGGTCTGGTGCAGCGATTCCGGCCGGACTTC AAACGCAAGTTTGCAGACTTCTTTGAAGATGATACCATATCAGAGTATATTTACC ACTGCAACGCACAGAATCCCAGTGGTGAGACAGCATTCAAAGCCATGATGGAGT CCTTTGGCTGGGCCCGGCGCCCTATGCTGGAGCGAATTCACTTGATTCGAAAAGA TGTGCCTATCACTATGATCTACGGGTCCGACACCTGGATAGATACCAGTACGGGA AAAAAGGTGAAGATGCAGCGGCCGGATTCCTATGTCCGAGACATGGAGATTAAG GGTGCCTCCCACCATGTCTATGCTGACCAGCCACACATCTTCAATGCTGTGGTGG AGGAGATCTGCGACTCAGTTGATTGAGCTGCTCTCTGAAGAGGAAGAGGAGAAA GCCAGAGAGTCACTCTTACCTCCCTGTCTGCTTACTCACCCACTCTGTCCTTTCCT CACCAACTAACATGTGCCAGCCAGGCAGAGTCTTGTGCTGTTCCCAGAACAGGA CGACAGTGAAAAGAACACTCTTGACCCTACACTGAAGGCTGAAGGCAGAAGCCA CAAGAGGCCTTGAGTGCCACCCCCAGGGAAGAACATAAAGGGTTGCACAATGCC ACCCATCCACTCCTTGCCAAGTGTTACCCAGATGGTGGAGGATGTGAAGGGATT GCACCAAGCCACATTCACTCTCTCTGTGGCCTTTCTTCCTCTGGGCAAAGAAGGG CTTCCAGTGGCCTTTCCTCACTCTGTAGTGTTTGTGGGGATAGGTTCCATGCAAG AACACCTTCCTCCTCCATCCCCCACTTCACCCCATCCCATACCAGTTCCATCCAG GGTCTGCTTAACTGCCAAGAGCAGGTCCTGGAGTTCCCTTCACCTGCAGAGTCCT TTTCATGACCTAGGAGGTCTTATTCAAAGCCCTCATTGACAGAGGAGGAAACAG GCCAAGGCAGGACATGGCTGGACCATGGTGATACAGCTCTGTGTGATTCAAGTT CTGGCAGAGCTTGTAAGGCTAGAGCCCAGGTCTGCCGACACCCTGTGCTTGTTGC ACACTTGATTTGCTAAGGCTGGAGACAGGCACCATTGCCATGGGGCTGGTCCTA GTCACTGGCCGAGGATAAGCCCGTCCCTGTCCCACATTCTAGCCCCACTATGCGG 
GGGTGCTGTTGTCCTGCCTGTGTCTCATCCCCAGCTGCCTAAGCTAGGGACACTC AAGTGCTTCCTTCCTTGCCCCATCTTCCTCCCAACTGGAGGCCTCTGAGCCTCCCC TGTGCCTTGGGCCCTGAAGCCCCATATGTAGTATAGAGCAAAGGTGGCTCCTGGT GAAGAGAGGGTGGAAAGGCCCTTCAGCCCCAGGGCCATGTCTGGGTTCTCCATG CCCATCAGTCTCTGCAGTTTCTCTACCTGCCCCCAGAGCTGAGGCCATCTGCAAG CCCCTGCCCATGGCCCAATGGGGAGCCTCCAGCCACAAGTTCCCTGTCCTTATCA GCCACTGGGTGGTTCCCACTGCATGACCCTCTATCCCTGCCATCTGTCCCCATGG TTTCCAGCTCAATCCACCCCTGACCCATCTGTCAGCTTTTTCCCAGGGAGCCGTTT CAGGGGTTCTG.

[5920] The human 32225 sequence (FIG. 18; SEQ ID NO:33) is approximately 2305 nucleotides long. The nucleic acid sequence includes an initiation codon (ATG) and a termination codon (TGA) which are underscored above. The region between and inclusive of the initiation codon and the termination codon is a methionine-initiated coding sequence of about 1029 nucleotides, including the termination codon (nucleotides indicated as “coding” of SEQ ID NO:33; SEQ ID NO:35). The coding sequence encodes a 342 amino acid protein (SEQ ID NO:34), which is recited as follows: MADDLEQQSQGWLSSWLPTWRPTSMSQLKNVEARILQCLQNKFLARYVSLPNQNKI WTVTVSPEQNDRTPLVMVHGFGGGVGLWILNMDSLSARRTLHTFDLLGFGRSSRPA FPRDPEGAEDEFVTSIETWRETMGIPSMILLGHSLGGFLATSYSIKYPDRVKHLILVDP WGFPLRPTNPSEIRAPPAWVKAVASVLGRSNPLAVLRVAGPWGPGLVQRFRPDFKR KFADFFEDDTISEYIYHCNAQNPSGETAFKAMMESFGWARRPMLERIHLIRKDVPIT MIYGSDTWIDTSTGKKVKMQRPDSYVRDMEIKGASHHVYADQPHIFNAVVEEICDS VD.
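The relationship between the recited cDNA and the recited protein can be illustrated with a minimal translation sketch. It is not part of the original disclosure: the function name, the use of the standard genetic code, and the choice to start at the first ATG are assumptions made for illustration. The fragment is the opening stretch of SEQ ID NO:33 as recited above.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_from_first_atg(dna: str) -> str:
    """Translate from the first ATG to a stop codon or the last complete codon.

    Hypothetical helper: whitespace is ignored; only A, C, G, T are supported.
    """
    seq = "".join(dna.split()).upper()
    start = seq.find("ATG")
    if start < 0:
        return ""
    residues = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # termination codon ends the open reading frame
            break
        residues.append(aa)
    return "".join(residues)


# Opening nucleotides of SEQ ID NO:33, copied from the recited sequence.
FRAGMENT = "GGCTCGCCAGGACCTGGCAAGGCTTGTTTACTATGGCCGATGATCTGGAGCAGC"

if __name__ == "__main__":
    print(translate_from_first_atg(FRAGMENT))
```

Applied to the fragment, the sketch yields the first residues of SEQ ID NO:34 (MADDLEQ), which is consistent with the initiation codon being the first ATG of the recited cDNA.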

Example 25 Tissue Distribution of 32225 mRNA by TaqMan Analysis

[5921] Endogenous human 32225 gene expression was determined using the Perkin-Elmer/ABI 7700 Sequence Detection System, which employs TaqMan technology. Briefly, TaqMan technology relies on standard RT-PCR with the addition of a third gene-specific oligonucleotide (referred to as a probe) which has a fluorescent dye coupled to its 5′ end (typically 6-FAM) and a quenching dye at the 3′ end (typically TAMRA). When the fluorescently tagged oligonucleotide is intact, the fluorescent signal from the 5′ dye is quenched. As PCR proceeds, the 5′ to 3′ nucleolytic activity of Taq polymerase digests the labeled probe, producing a free nucleotide labeled with 6-FAM, which is now detected as a fluorescent signal. The PCR cycle at which fluorescence is first released and detected is inversely related to the logarithm of the starting amount of the gene of interest in the test sample, thus providing a quantitative measure of the initial template concentration. Samples can be internally controlled by the addition of a second primer/probe set specific for a housekeeping gene such as GAPDH, the probe of which is labeled with a different fluorophore on the 5′ end (typically VIC).
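The dependence of the detection cycle on starting template can be sketched numerically. The following is an illustration only: it assumes ideal doubling of product in every cycle and uses the widely used comparative threshold-cycle (2^-ΔΔCt) calculation, which the specification does not state was the method used to derive the relative expression values in the tables. All function and parameter names are hypothetical.

```python
import math


def ct_shift_for_fold_change(fold: float) -> float:
    """Change in threshold cycle when starting template changes by `fold`.

    Assumes perfect doubling per cycle, so a 2-fold increase in template
    lowers the threshold cycle by one.
    """
    return -math.log2(fold)


def relative_expression(ct_target: float, ct_reference: float,
                        ct_target_cal: float, ct_reference_cal: float) -> float:
    """Comparative Ct estimate of target expression relative to a calibrator.

    ct_target / ct_reference: threshold cycles of the gene of interest and
    the housekeeping gene (e.g., GAPDH) in the test sample.
    ct_target_cal / ct_reference_cal: the same values in the calibrator sample.
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_cal - ct_reference_cal
    return 2.0 ** -(delta_sample - delta_calibrator)


if __name__ == "__main__":
    print(ct_shift_for_fold_change(8.0))
    print(relative_expression(25.0, 20.0, 27.0, 20.0))
```

Normalizing to the housekeeping gene before comparing samples corrects for differences in the amount of input cDNA, which is the purpose of the internal control described above.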

[5922] To determine the level of 32225 in various human tissues, a primer/probe set was designed. Total RNA was prepared from a series of human tissues using an RNeasy kit from Qiagen. First strand cDNA was prepared from 1 μg total RNA using an oligo-dT primer and Superscript II reverse transcriptase (Gibco/BRL). cDNA obtained from approximately 50 ng total RNA was used per TaqMan reaction. Tissues tested include the human tissues and several cell lines shown in Tables 6 and 7. 32225 mRNA was detected in brain, liver, breast, ovary, colon, and lung tissues. In addition, 32225 expression was altered in some tumor samples, as compared to the appropriate normal tissues.

TABLE 6

Tissue Type                 Relative Expression
PIT 400 Breast Normal       1.46
PIT 56 Breast Normal        3.08
MDA 106 Breast Tumor        2.06
MDA 234 Breast Tumor        1.33
NDR 57 Breast Tumor         0.23
MDA 304 Breast Tumor        0.73
NDR 58 Breast Tumor         0.40
NDR 132 Breast Tumor        13.23
NDR 07 Breast Tumor         0.36
NDR 12 Breast Tumor         21.94
PIT 208 Ovary Normal        8.61
CHT 620 Ovary Normal        8.14
CHT 619 Ovary Normal        15.63
CLN 03 Ovary Tumor          0.17
CLN 05 Ovary Tumor          1.53
CLN 17 Ovary Tumor          4.23
CLN 07 Ovary Tumor          0.31
CLN 08 Ovary Tumor          0.43
MDA 216 Ovary Tumor         1.84
CLN 012 Ovary Tumor         23.77
MDA 25 Ovary Tumor          21.72
MDA 183 Lung Normal         0.24
CLN 930 Lung Normal         0.66
MDA 185 Lung Normal         0.57
CHT 816 Lung Normal         0.07
MPI 215 Lung Tumor-SmC      11.60
MDA 259 Lung Tumor-PDNSCCL  5.70
CHT 832 Lung Tumor-PDNSCCL  2.48
MDA 253 Lung Tumor-PDNSCCL  32.46
CHT 814 Lung Tumor-SCC      68.63
CHT 911 Lung Tumor-SCC      65.38
CHT 726 Lung T-SCC          4.22
CHT 845 Lung T-AC           17.10
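As an illustrative check of how the lung values in Table 6 compare, the normal and tumor entries can be summarized as follows. This is a sketch only; the use of medians and the helper names are assumptions and are not part of the original analysis.

```python
from statistics import median

# Relative expression values for lung samples, copied from Table 6.
LUNG_NORMAL = [0.24, 0.66, 0.57, 0.07]
LUNG_TUMOR = [11.60, 5.70, 2.48, 32.46, 68.63, 65.38, 4.22, 17.10]


def all_above(tumor: list[float], normal: list[float]) -> bool:
    """True if every tumor value exceeds the highest normal value."""
    return min(tumor) > max(normal)


def median_fold_change(tumor: list[float], normal: list[float]) -> float:
    """Ratio of the tumor median to the normal median (hypothetical summary)."""
    return median(tumor) / median(normal)


if __name__ == "__main__":
    print(all_above(LUNG_TUMOR, LUNG_NORMAL))
    print(round(median_fold_change(LUNG_TUMOR, LUNG_NORMAL), 1))
```

Every lung tumor value in Table 6 exceeds the highest normal lung value, and the tumor median is roughly 35 times the normal median.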

[5923] 32225 is expressed in ovary and breast tissues, and marginally expressed in lung tissue. The expression of 32225 in lung tumors is dramatically elevated in all of the samples tested. Similarly, 32225 expression is elevated in a subset of breast and ovary tumor samples. In a distinct subset of ovary tumors, however, 32225 expression is decreased relative to the levels in normal tissues.

TABLE 7

Tissue Type                 Relative Expression
CHT 396 Colon Normal        0.15
CHT 519 Colon Normal        0.15
CHT 416 Colon Normal        0.65
CHT 452 Colon Normal        0.19
CHT 398 Colon Tumor         26.83
CHT 807 Colon Tumor         0.04
CHT 528 Colon Tumor         9.01
CHT 368 Colon Tumor         0.09
CHT 372 Colon Tumor         0.84
CHT 01 Liver Metastasis     7.73
CHT 3 Liver Metastasis      12.13
CHT 896 Liver Metastasis    2.02
NDR 217 Liver Metastasis    0.47
PIT 260 Liver Normal        0.46
PIT 229 Liver Normal        43.89
MGH 16 Brain Normal         48.03
MCL 53 Brain Normal         72.54
MCL 390 Brain Normal        74.58
Astrocytes                  131.21
CHT 201 Brain Tumor         1.17
CHT 216 Brain Tumor         5.45
CHT 501 Brain Tumor         6.35
CHT 1273 Brain Tumor        185.57
A24 HMVEC-Arrested          31.36
C48 HMVEC-Proliferating     21.27
BWH 54 Fetal Liver          117.85
BWH 75 Fetal Liver          20.98
CHT 765 Wilms Tumor         32.35
CHT 1424 Endometrial AC     23.93
MCL 377 Brain Normal        129.86
BWH 58 Fetal Adrenal        496.55
PIT 251 Fetal Adrenal       19.57
PIT 213 Renal Tumor         0.07

[5924] Expression of 32225 is very high in the fetal adrenal gland and liver, as well as in astrocytes and the brain. 32225 is also expressed in endothelial cells and cells cultured from an endometrial cancer, a Wilms tumor, and several liver metastases. The expression of 32225 was marginal in colon tissue, but elevated in a subset of colon tumors. As with ovary tumors (see Table 6), a subset of brain tumors showed an increase in 32225 expression, while a distinct subset displayed a decrease in 32225 expression relative to the levels observed in normal brain tissue samples.

Example 26 Tissue Distribution of 32225 mRNA by Northern Analysis

[5925] Northern blot hybridizations with various RNA samples can be performed under standard conditions and washed under stringent conditions, i.e., 0.2×SSC at 65° C. A DNA probe corresponding to all or a portion of the 32225 cDNA (SEQ ID NO:33) can be used. The DNA was radioactively labeled with ³²P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) can be probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to manufacturer's recommendations.

Example 27 Recombinant Expression of 32225 in Bacterial Cells

[5926] In this example, 32225 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 32225 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-32225 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB199 strain by affinity chromatography on glutathione beads. The molecular weight of the resultant fusion polypeptide is determined by polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates.
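A rough expectation for the size of the fusion polypeptide on a gel can be sketched before the measurement is made. The figures used here are common rules of thumb and are assumptions, not values from the specification: an average of about 110 Da per amino acid residue and about 26 kDa for the GST moiety. Only the 342-residue length of the 32225 protein (SEQ ID NO:34) comes from the text.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average; an assumption
GST_TAG_MASS_KDA = 26.0          # approximate mass of the GST moiety; an assumption


def estimate_fusion_mass_kda(n_residues: int,
                             tag_mass_kda: float = GST_TAG_MASS_KDA) -> float:
    """Approximate mass in kDa of a tag fused to a protein of n_residues."""
    protein_kda = n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0
    return tag_mass_kda + protein_kda


if __name__ == "__main__":
    # 342 residues is the length recited for the 32225 protein.
    print(round(estimate_fusion_mass_kda(342), 1))
```

Such an estimate only indicates where on the gel the band is expected; the apparent molecular weight observed by electrophoresis can differ from it.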

Example 28 Expression of Recombinant 32225 Protein in COS Cells

[5927] To express the 32225 gene in COS cells (e.g., COS-7 cells, CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182), the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 32225 protein, with an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment, is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[5928] To construct the plasmid, the 32225 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 32225 coding sequence starting from the initiation codon; the 3′ primer contains sequences complementary to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag, and the last 20 nucleotides of the 32225 coding sequence. The PCR-amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably the two restriction sites chosen are different so that the 32225 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.
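The primer layout described above can be sketched in code. This is an illustration under stated assumptions rather than the procedure actually used: the restriction sites (EcoRI, GAATTC; XhoI, CTCGAG), the particular HA-encoding DNA sequence, the stop codon, and all function names are hypothetical choices, and practical primers usually carry a few extra 5′ bases ahead of the restriction site, which are omitted here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}

# One DNA sequence encoding the HA epitope YPYDVPDYA (illustrative choice).
HA_TAG_DNA = "TACCCATACGATGTTCCAGATTACGCT"


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an A/C/G/T string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def design_primers(cds: str, site_5p: str, site_3p: str,
                   tag_dna: str = HA_TAG_DNA, stop: str = "TGA",
                   anneal: int = 20) -> tuple[str, str]:
    """Build a 5' primer and a 3' primer for C-terminal tagging.

    5' primer: restriction site + first `anneal` nucleotides of the CDS.
    3' primer: reverse complement of
               (last `anneal` coding nucleotides + tag + stop + restriction site),
               with the native stop codon removed so the tag stays in frame.
    """
    cds = cds.upper()
    body = cds[:-3] if cds[-3:] in STOP_CODONS else cds
    forward = site_5p + body[:anneal]
    sense_3p_end = body[-anneal:] + tag_dna + stop + site_3p
    return forward, reverse_complement(sense_3p_end)


if __name__ == "__main__":
    toy_cds = "ATG" + "GCC" * 10 + "TGA"  # toy coding sequence, not 32225
    fwd, rev = design_primers(toy_cds, "GAATTC", "CTCGAG")
    print(fwd)
    print(rev)
```

Using two different sites, one on each primer, is what fixes the orientation of the insert after digestion and ligation, as the paragraph above notes.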

[5929] COS cells are subsequently transfected with the 32225-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. The expression of the 32225 polypeptide is detected by radiolabelling (³⁵S-methionine or ³⁵S-cysteine, available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) using an HA-specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with ³⁵S-methionine (or ³⁵S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer: 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA-specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

[5930] Alternatively, DNA containing the 32225 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 32225 polypeptide is detected by radiolabelling and immunoprecipitation using a 32225-specific monoclonal antibody.

Examples for 47508

Example 29 Identification and Characterization of Human 47508 cDNA

[5931] The human 47508 nucleic acid sequence is recited as follows: CGGAGCTTCCGAACCAGGCGGGATTCCACCGGGTATTTGCCTGCGGAGGCGGGA (SEQ ID NO: 41) CTTCGGGCTTGATGGGCGTTGGGGGTGGCCTTCCTGCGGGCAGGCTCTCTGTGTC GCAACACTGGCGGGGCGGGCCAAATCGGCCAGAGCTCTGCCCCCAGAGGACGC GGCTAAGCCCGGGGGCGTGTCCTGGGCTGGCCCCACCCGCGCCCCGCCCCGCCC CGCCCGGTCGCGGAGCTGCGGCCAGCTTTGGGAGGGCCGGCCCCGGGATGCTAC ACACAACCCAGCTGTACCAGCATGTGCCAGAGACACGCTGGCCAATCGTGTACT CGCCGCGCTACAACATCACCTTCATGGGCCTGGAGAAGCTGCATCCCTTTGATGC CGGAAAATGGGGCAAAGTGATCAATTTCCTAAAAGAAGAGAAGCTTCTGTCTGA CAGCATGCTGGTGGAGGCGCGGGAGGCCTCGGAGGAGGACCTGCTGGTGGTGCA CACGAGGCGCTATCTTAATGAGCTCAAGTGGTCCTTTGCTGTTGCTACCATCACA GAAATCCCCCCCGTTATCTTCCTCCCCAACTTCCTTGTGCAGAGGAAGGTGCTGA GGCCCCTTCGGACCCAGACAGGAGGAACCATAATGGCGGGGAAGCTGGCTGTGG AGCGAGGCTGGGCCATCAACGTGGGGGGTGGCTTCCACCACTGCTCCAGCGACC GTGGCGGGGGCTTCTGTGCCTATGCGGACATCACGCTCGCCATCAAGTTTCTGTT TGAGCGTGTGGAGGGCATCTCCAGGGCTACCATCATTGATCTTGATGCCCATCAG GGCAATGGGCATGAGCGAGACTTCATGGACGACAAGCGTGTGTACATCATGGAT GTCTACAACCGCCACATCTACCCAGGGGACCGCTTTGCCAAGCAGGCCATCAGG CGGAAGGTGGAGCTGGAGTGGGGCACAGAGGATGATGAGTACCTGGATAAGGT GGAGAGGAACATCAAGAAATCCCTCCAGGAGCACCTGCCCGACGTGGTGGTATA CAATGCAGGCACCGACATCCTCGAGGGGGACCGCCTTGGGGGGCTGTCCATCAG CCCAGCGGGCATCGTGAAGCGGGATGAGCTGGTGTTCCGGATGGTCCGTGGCCG CCGGGTGCCCATCCTTATGGTGACCTCAGGCGGGTACCAGAAGCGCACAGCCCG CATCATTGCTGACTCCATACTTAATCTGTTTGGCCTGGGGCTCATTGGGCCTGAG TCACCCAGCGTCTCCGCACAGAACTCAGACACACCGCTGCTTCCCCCTGCAGTGC CCTGACCCTTGCTGCCCTGCCTGTCACGTGGCCCTGCCTATCCGCCCCTTAGTGCT TTTTGTTTTCTAACCTCATGGGGTGGTGGAGGCAGCCTTCAGTGAGCATGGAGGG GCAGGGCCATCCCTGGCTGGGGCCTGGAGCTGGCCCTTCCTCTACTTTTCCCTGC TGGAAGCCAGAAGGGCTTGAGGCCTCTATGGGTGGGGGCAGAAGGCAGAGCCT GTGTCCCAGGGGGACCCACACGAAGTCACCAGCCCATAGGTCCAGGGAGGCAG GCAGG.

[5932] The human 47508 sequence (FIG. 21; SEQ ID NO:41) is approximately 1579 nucleotides long. The nucleic acid sequence includes an initiation codon (ATG) and a termination codon (TGA) which are underscored above. The region between and inclusive of the initiation codon and the termination codon is a methionine-initiated coding sequence of about 1242 nucleotides, including the termination codon (nucleotides indicated as “coding” of SEQ ID NO:41; SEQ ID NO:43). The coding sequence encodes a 413 amino acid protein (SEQ ID NO:42), which is recited as follows: MGVGGGLPAGRLSVSQHWRGGPNRPELCPQRTRLSPGACPGLAPPAPRPAPPGRGA AASFGRAGPGMLHTTQLYQHVPETRWPIVYSPRYNITFMGLEKLHPFDAGKWGKVI NFLKEEKLLSDSMLVEAREASEEDLLVVHTRRYLNELKWSFAVATITEIPPVIFLPNFL VQRKVLRPLRTQTGGTIMAGKLAVERGWAINVGGGFHHCSSDRGGGFCAYADITLA IKFLFERVEGISRATIIDLDAHQGNGHERDFMDDKRVYIMDVYNRHIYPGDRFAKQA IRRKVELEWGTEDDEYLDKVERNIKKSLQEHLPDVVVYNAGTDILEGDRLGGLSISP AGIVKRDELVFRMVRGRRVPILMVTSGGYQKRTARIIADSILNLFGLGLIGPESPSVSA QNSDTPLLPPAVP.

Example 30 Tissue Distribution of 47508 mRNA by TaqMan Analysis

[5933] Endogenous human 47508 gene expression was determined using the Perkin-Elmer/ABI 7700 Sequence Detection System, which employs TaqMan technology. Briefly, TaqMan technology relies on standard RT-PCR with the addition of a third gene-specific oligonucleotide (referred to as a probe) which has a fluorescent dye coupled to its 5′ end (typically 6-FAM) and a quenching dye at the 3′ end (typically TAMRA). When the fluorescently tagged oligonucleotide is intact, the fluorescent signal from the 5′ dye is quenched. As PCR proceeds, the 5′ to 3′ nucleolytic activity of Taq polymerase digests the labeled probe, producing a free nucleotide labeled with 6-FAM, which is now detected as a fluorescent signal. The PCR cycle at which fluorescence is first released and detected is inversely related to the logarithm of the starting amount of the gene of interest in the test sample, thus providing a quantitative measure of the initial template concentration. Samples can be internally controlled by the addition of a second primer/probe set specific for a housekeeping gene such as GAPDH, the probe of which is labeled with a different fluorophore on the 5′ end (typically VIC).

[5934] To determine the level of 47508 in various human tissues, a primer/probe set was designed. Total RNA was prepared from a series of human tissues using an RNeasy kit from Qiagen. First strand cDNA was prepared from 1 μg total RNA using an oligo-dT primer and Superscript II reverse transcriptase (Gibco/BRL). cDNA obtained from approximately 50 ng total RNA was used per TaqMan reaction. Tissues tested include the human tissues and several cell lines shown in Tables 8-11. 47508 mRNA was detected in all tissues and cell lines analyzed (Tables 8-11), including normal and cancerous tissues of the breast, ovary, lung, and colon (Table 8).

TABLE 8

Tissue                                                  Relative Expression
PIT 400 Breast Normal                                   1.56
PIT 372 Breast Normal                                   1.99
CHT 558 Breast Normal                                   0.47
CLN 168 Breast Tumor: IDC                               11.20
MDA 304 Breast Tumor: moderately differentiated IDC     0.71
NDR 57 Breast Tumor: poorly differentiated IDC          4.19
NDR 132 Breast Tumor: IDC/ILC                           10.10
CHT 562 Breast Tumor: IDC                               20.69
NDR 12 Breast Tumor                                     19.17
PIT 208 Ovary Normal                                    7.04
CHT 620 Ovary Normal                                    5.98
CLN 03 Ovary Tumor                                      2.78
CLN 17 Ovary Tumor                                      9.42
MDA 25 Ovary Tumor                                      18.52
MDA 216 Ovary Tumor                                     6.26
CLN 012 Ovary Tumor                                     13.51
MDA 185 Lung Normal                                     0.76
CLN 930 Lung Normal                                     2.04
MDA 183 Lung Normal                                     0.51
MPI 215 Lung Tumor: SmC                                 2.21
MDA 259 Lung Tumor: PDNSCCL                             2.87
CHT 832 Lung Tumor: PDNSCCL                             3.09
CHT 911 Lung Tumor: SCC                                 3.52
MDA 262 Lung Tumor: SCC                                 10.97
CHT 211 Lung Tumor: adenocarcinoma                      2.72
MDA 253 Lung Tumor: PDNSCCL                             0.46
NHBE                                                    27.68
CHT 396 Colon Normal                                    21.87
CHT 523 Colon Normal                                    2.17
CHT 452 Colon Normal                                    0.55
CHT 382 Colon Tumor: moderately differentiated          6.09
CHT 528 Colon Tumor: moderately differentiated          4.65
CLN 609 Colon Tumor                                     4.91
CHT 372 Colon Tumor: poorly/moderately differentiated   3.27
NDR 217 Colon-Liver Metastasis                          3.38
NDR 100 Colon-Liver Metastasis                          7.26
PIT 260 Liver Normal (female)                           0.21
ONC 102 Hemangioma                                      0.67
A24 HMVEC Arrested                                      3.01
C48 HMVEC Proliferating                                 3.97

[5935] As shown in Table 8, 47508 is expressed in normal breast, ovary, lung, colon, liver, and endothelial cells, as well as tumors of the breast, ovary, lung, colon, and liver (metastases originating from the colon). In most breast tumors analyzed, 47508 expression is elevated relative to that observed in normal breast tissue. In addition, a subset of ovary and lung tumors display an increase in 47508 expression, and all of the colon tumor samples display an increase in 47508 expression relative to the majority of normal colon tissue samples analyzed. Abbreviations used in Table 8 include: IDC, invasive ductal carcinoma; ILC, invasive lobular carcinoma; SmC; PDNSCCL, poorly differentiated non-small cell carcinoma of the lung; SCC, squamous cell carcinoma; and HMVEC, human microvascular endothelial cells.

TABLE 9

Cell Line               Relative Expression
MCF10MS                 16.35
MCF10A                  10.27
MCF10AT.cl1             12.47
MCF10AT.cl3             12.74
MCF10AT1                6.92
MCF10AT3B               10.64
MCF10CA1a.cl1           6.90
MCF10CA1a.cl1 Agar      16.63
MCF10A.m25 Plastic      18.65
MCF10CA Agar            10.13
MCF10CA Plastic         4.41
MCF3B Agar              23.60
MCF3B Plastic           14.28
MCF10A EGF 0 hr         6.30
MCF10A EGF 0.5 hr       6.78
MCF10A EGF 1 hr         5.96
MCF10A EGF 2 hr         5.43
MCF10A EGF 4 hr         7.21
MCF10A EGF 8 hr         5.66
MCF10A IGF1A 0 hr       18.84
MCF10A IGF1A 0.5 hr     25.30
MCF10A IGF1A 1 hr       19.51
MCF10A IGF1A 3 hr       32.69
MCF10A IGF1A 24 hr      51.30
MCF10AT3B.cl5 Plastic   15.15
MCF10AT3B.cl6 Plastic   23.52
MCF10AT3B.cl3 Plastic   22.17
MCF10AT3B.cl1 Plastic   26.10
MCF10AT3B.cl4 Plastic   11.88
MCF10AT3B.cl2 Plastic   25.03
MCF10AT3B.cl5 Agar      29.36
MCF10AT3B.cl6 Agar      35.28
MCF-7                   116.23
ZR-75                   40.11
T47D                    54.22
MDA-231                 11.76
MDA-435                 10.42
SkBr3                   49.89
Hs578Bst                6.17
Hs578T                  9.29
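The paired entries in Table 9 lend themselves to a simple ratio comparison. The sketch below is illustrative only: the pairing of agar and plastic entries by cell line name and the expression of the IGF1A time course as fold change over the 0 hr value are assumptions about how the table might be read, not analyses described in the specification.

```python
# (agar, plastic) relative expression pairs copied from Table 9.
AGAR_VS_PLASTIC = {
    "MCF10CA": (10.13, 4.41),
    "MCF3B": (23.60, 14.28),
    "MCF10AT3B.cl5": (29.36, 15.15),
    "MCF10AT3B.cl6": (35.28, 23.52),
}

# MCF10A IGF1A time course from Table 9: hours of exposure -> relative expression.
IGF1A_TIME_COURSE = {0: 18.84, 0.5: 25.30, 1: 19.51, 3: 32.69, 24: 51.30}


def agar_to_plastic_ratios(pairs: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Ratio of agar to plastic expression for each cell line."""
    return {name: agar / plastic for name, (agar, plastic) in pairs.items()}


def fold_over_baseline(course: dict[float, float]) -> dict[float, float]:
    """Expression at each time point divided by the 0 hr value."""
    baseline = course[0]
    return {hours: value / baseline for hours, value in course.items()}


if __name__ == "__main__":
    for name, ratio in agar_to_plastic_ratios(AGAR_VS_PLASTIC).items():
        print(f"{name}: {ratio:.2f}")
    for hours, fold in fold_over_baseline(IGF1A_TIME_COURSE).items():
        print(f"{hours} hr: {fold:.2f}")
```

Each of the four agar/plastic ratios is greater than one, and the 24 hr IGF1A value is a little under three times the 0 hr value.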

[5936] As shown in Table 9, 47508 mRNA is expressed in many different cell lines derived from breast tumors, and under many different culture conditions. In some cell lines, e.g., the MCF10CA, MCF3B, and MCF10AT3B.cl5 and cl6 cell lines, expression of 47508 is elevated in cells that are plated on agar, as compared to cells that are plated on plastic. Cells that are plated on agar tend to have a more transformed phenotype, so the increase in 47508 expression in breast tumor cells plated on agar is consistent with the increase in 47508 expression observed in many tumors (see Table 8). In addition, MCF10A cells display an increase in 47508 expression in response to Insulin-like Growth Factor (IGF) 1A after 3 hours, with an even larger increase in 47508 expression after 24 hours of exposure to IGF1A.

TABLE 10

Cell Line           Relative Expression
H460 +p16 24 hr     1.15
H460 +p16 48 hr     0.72
H460 +p16 72 hr     0.41
H460 +p16 96 hr     0.28
H460 −p16 24 hr     0.32
H460 −p16 48 hr     0.48
H460 −p16 72 hr     0.23
H460 −p16 96 hr     0.44
H460 −p16 24 hr     0.29
H460 −p16 48 hr     0.47
H460 −p16 72 hr     0.29
H460 −p16 96 hr     0.45
H460 +p16 48 hr     0.34
H460 +p16 72 hr     0.44
H460 +p16 96 hr     0.22

[5937] As shown in Table 10, H460 large cell lung carcinoma cells express 47508 in both the presence and absence of the tumor suppressor gene, p16.

TABLE 11

Cell Line               Relative Expression
NHBE                    43.59
A549 (BA)               36.78
H460 (LCLC)             1.91
H23 (AC)                36.02
H522 (AC)               62.28
H125 (AC/SCC)           56.52
H520 (SCC)              26.01
H69 (SCLC)              3.25
H345 (SCLC)             42.25
H460 INCX 24 hr         1.91
H460 p16 24 hr          2.91
H460 INCX 48 hr         1.65
H460 p16 48 hr          2.20
H460 INCX Stable Plas   3.75
H460 p16 Stable Plas    3.40
H460 NA-Agar            1.19
H460 Incx stable Agar   0.38
H460 p16 stable Agar    0.90
H125 Incx 96 hr         43.74
H125 p53 96 hr          36.52
H345 Mock 144 hr        66.06
H345 Gluc 144 hr        42.54
H345 VIP 144 hr         80.21

[5938] As shown in Table 11, a number of different lung cell lines express 47508, with neither the p16 nor the p53 tumor suppressor gene having a strong impact on 47508 expression under the conditions examined. Expression of 47508 is lowest in the H460 large cell lung carcinoma cells. Abbreviations used in Table 11 include: NHBE, primary normal human bronchial epithelial cells; LCLC, large cell lung carcinoma; AC, adenocarcinoma; SCC, squamous cell carcinoma; SCLC, small cell lung carcinoma; INCX; Gluc; and VIP.

Example 31 Tissue Distribution of 47508 mRNA by Northern Analysis

[5939] Northern blot hybridizations with various RNA samples can be performed under standard conditions and washed under stringent conditions, i.e., 0.2×SSC at 65° C. A DNA probe corresponding to all or a portion of the 47508 cDNA (SEQ ID NO:41) can be used. The DNA was radioactively labeled with ³²P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) can be probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to manufacturer's recommendations.

Example 32 Recombinant Expression of 47508 in Bacterial Cells

[5940] In this example, 47508 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 47508 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-47508 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB199 strain by affinity chromatography on glutathione beads. The molecular weight of the resultant fusion polypeptide is determined by polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates.

Example 33 Expression of Recombinant 47508 Protein in COS Cells

[5941] To express the 47508 gene in COS cells (e.g., COS-7 cells, CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182), the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 47508 protein, with an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment, is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[5942] To construct the plasmid, the 47508 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 47508 coding sequence starting from the initiation codon; the 3′ primer contains sequences complementary to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag, and the last 20 nucleotides of the 47508 coding sequence. The PCR-amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably the two restriction sites chosen are different so that the 47508 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.

[5943] COS cells are subsequently transfected with the 47508-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. The expression of the 47508 polypeptide is detected by radiolabelling (³⁵S-methionine or ³⁵S-cysteine, available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) using an HA-specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with ³⁵S-methionine (or ³⁵S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer: 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA-specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

[5944] Alternatively, DNA containing the 47508 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 47508 polypeptide is detected by radiolabelling and immunoprecipitation using a 47508-specific monoclonal antibody.

Examples for 56939

Example 34 Identification and Characterization of Human 56939 cDNA

[5945] The human 56939 nucleic acid sequence is recited as follows: GCTCCAAGATGTCAGCAACGCTGATCCTGGAGCCCCCAGGCCGCTGCTGCTGGA (SEQ ID NO: 48) ACGAGCCGGTGCGCATTGCCGTGCGCGGCCTGGCCCCGGAGCAGCGGGTTACGC TGCGCGCGTCCCTGCGCGACGAGAAGGGCGCGCTCTTCCGGGCCCACGCGCGCT ACTGCGCCGACGCCTGCGGCGAGCTGGACCTGGAGCGCGCACCCGCGCTGGGCG GCAGCTTCGCGGGACTCGAGCCCATGGGGCTGCTCTGGGCCCTGGAACCCGAGA AGCCTTTTTGGCGCTTCCTGAAGCGGGACGTACAGATTCCTTTTGTCGTGGAGTT GGAGGTGCTGGACGGCCACGACCCCGAGCCTGGACGGCTGCTGTGCCAGGCGCA GCACGAGCGCCACTTCCTCCCGCCAGGGGTGCGGCGCCAGTCGGTGCGAGCGGG CCGGGTGCGCGCCACGCTCTTCCTGCCGCCAGGACCTGGACCCTTCCCAGGGATC ATTGACATCTTTGGTATTGGAGGGGGCCTCTTGGAATATCGAGCCAGCCTCCTTG CTGGCCATGGCTTTGCCACGTTGGCTCTAGCTTATTATAACTTTGAAGATCTCCCC AATAACATGGACAACATATCCCTGGAGTACTTCGAAGAAGCCGTATGCTACATG CTTCAACATCCCCAGGTAAAAGGCCCAGGCATTGGGCTTTTGGGCATTTCTCTAG GAGCTGATATTTGTCTCTCAATGGCCTCATTCTTGAAGAATGTCTCAGCCACAGT TTCCATCAATGGATCTGGGATCAGTGGGAACACAGCCATCAACTATAAGCACAG TAGCATTCCACCATTGGGCTATGACCTGAGGAGAATCAAGGTAGCTTTCTCAGGC CTCGTGGACATCGTGGATATAAGGAATGCTCTCGTAGGAGGGTACAAGAACCCC AGCATGATTCCAATAGAGAAGGCCCAGGGGCCCATCCTGCTCATTGTTGGTCAG GATGACCATAACTGGAGAAGTGAGTTGTATGCCCAAACAGTCTCTGAACGGTTA CAGGCCCATGGAAAGGAAAAACCCCAGATCATCTGTTACCCTGGGACTGGGCAT TACATCGAGCCTCCTTACTTCCCCCTGTGCCCAGCTTCCCTTCACAGATTACTGAA CAAACATGTTATATGGGGTGGGGAGCCCAGGGCTCATTCTAAGGCCCAGGAAGA TGCCTGGAAGCAAATTCTAGCCTTCTTCTGCAAACACCTGGGAGGTACCCAGAA AACAGCTGTCCCTAAATTGTAATGCATTTGTCTGTTGTTGACATGAGAGATTCAA GATCAGATTCTAGTGTTCAGTAACCCTATGTGAATCAGATGTCTCCTGGATAACA TTAAAGCCATGTCTTTGTCATTAAAAAAA.

[5946] The human 56939 sequence (SEQ ID NO:48) is approximately 1391 nucleotides long and includes an initiation codon (ATG) and a termination codon (TAA), which are underscored above. The region between and inclusive of the initiation codon and the termination codon is a methionine-initiated coding sequence of about 1266 nucleotides, including the termination codon (nucleotides indicated as “coding” of SEQ ID NO:48; SEQ ID NO:50). The coding sequence encodes a 421 amino acid protein (SEQ ID NO:49), which is recited as follows: MSATLILEPPGRCCWNEPVRIAVRGLAPEQRVTLRASLRDEKGALFRAHARYCADA (SEQ ID NO: 49) CGELDLERAPALGGSFAGLEPMGLLWALEPEKPFWRFLKRDVQIPFVVELEVLDGH DPEPGRLLCQAQHERHFLPPGVRRQSVRAGRVRATLFLPPGPGPFPGIIDIFGIGGGLL EYRASLLAGHGFATLALAYYNFEDLPNNMDNISLEYFEEAVCYMLQHPQVKGPGIG LLGISLGADICLSMASFLKNVSATVSINGSGISGNTAINYKHSSIPPLGYDLRRIKVAFS GLVDIVDIRNALVGGYKNPSMIPIEKAQGPILLIVGQDDHNWRSELYAQTVSERLQA HGKEKPQIICYPGTGHYIEPPYFPLCPASLHRLLNKHVIWGGEPRAHSKAQEDAWKQ ILAFFCKHLGGTQKTAVPKL.
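The relationship between the recited nucleotide and amino acid sequences can be checked with a short script. The following is a minimal sketch, not part of the original disclosure: the function names are hypothetical, the fragment is the first 44 nucleotides of SEQ ID NO:48 as recited above, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_from_first_atg(dna: str) -> str:
    """Translate from the first ATG until a stop codon or the end of the input."""
    start = dna.find("ATG")
    if start < 0:
        return ""
    residues = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # termination codon
            break
        residues.append(aa)
    return "".join(residues)


def protein_length(cds_nt_including_stop: int) -> int:
    """Number of residues encoded by a coding sequence whose length includes the stop codon."""
    if cds_nt_including_stop % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return cds_nt_including_stop // 3 - 1  # the stop codon encodes no residue


# First 44 nucleotides of SEQ ID NO:48 as recited in the text.
fragment = "GCTCCAAGATGTCAGCAACGCTGATCCTGGAGCCCCCAGGCCGC"
print(translate_from_first_atg(fragment))  # start of the recited protein
print(protein_length(1266))                # residues encoded by a 1266-nt coding sequence
```

Under these assumptions, the translated fragment reproduces the first residues of SEQ ID NO:49, and a 1266-nucleotide coding sequence corresponds to the stated 421 amino acids.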

Example 35 Tissue Distribution of 56939 mRNA by TaqMan Analysis

[5947] Endogenous human 56939 gene expression was determined using the Perkin-Elmer/ABI 7700 Sequence Detection System, which employs TaqMan technology. Briefly, TaqMan technology relies on standard RT-PCR with the addition of a third gene-specific oligonucleotide (referred to as a probe) which has a fluorescent dye coupled to its 5′ end (typically 6-FAM) and a quenching dye at the 3′ end (typically TAMRA). When the fluorescently tagged oligonucleotide is intact, the fluorescent signal from the 5′ dye is quenched. As PCR proceeds, the 5′ to 3′ nucleolytic activity of Taq polymerase digests the labeled probe, producing a free nucleotide labeled with 6-FAM, which is now detected as a fluorescent signal. The PCR cycle at which fluorescence is first released and detected is inversely related to the logarithm of the starting amount of the gene of interest in the test sample, thus providing a quantitative measure of the initial template concentration. Samples can be internally controlled by the addition of a second primer/probe set specific for a housekeeping gene such as GAPDH, with the probe labeled with a different fluorophore on the 5′ end (typically VIC).
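The text does not state how threshold cycles were converted into the expression values reported in the tables. As an illustration only, the sketch below uses the widely used comparative Ct calculation, in which the target gene is normalized to the housekeeping gene and each cycle of difference corresponds to a two-fold difference in template. The function name, the assumption of roughly 100% amplification efficiency, and the example Ct values are all hypothetical.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Comparative Ct estimate: 2 ** -(Ct_target - Ct_reference).

    Assumes each PCR cycle doubles the product (about 100% efficiency),
    so a lower Ct for the target means more starting template.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values, not taken from the document.
print(relative_expression(ct_target=24.0, ct_reference=25.0))  # target crosses threshold one cycle earlier
print(relative_expression(ct_target=28.0, ct_reference=25.0))  # target crosses threshold three cycles later
```

A target that crosses the detection threshold one cycle before the reference gene is estimated at twice the reference level; three cycles later corresponds to one eighth.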

[5948] To determine the level of 56939 in various human tissues a primer/probe set was designed. Total RNA was prepared from a series of human tissues using an RNeasy kit from Qiagen. First strand cDNA was prepared from 1 μg total RNA using an oligo-dT primer and Superscript II reverse transcriptase (Gibco/BRL). cDNA obtained from approximately 50 ng total RNA was used per TaqMan reaction. Tissues tested include the human tissues and several cell lines shown in Tables 12, 13, 14, and 15.

[5949] Table 12 below depicts the expression of 56939 mRNA in a panel of normal and tumor human tissues using TaqMan analysis. Elevated expression of 56939 was found in the following tissues: kidney, brain cortex, brain hypothalamus, dorsal root ganglion, colon tumor, and liver cells, including normal liver and liver fibrosis. Lower levels of expression were detected in normal and CHF heart, skeletal muscle, normal adipose tissue, pancreas, primary osteoclasts, normal skin, normal breast, breast tumor, normal ovary, ovary tumor, normal prostate, prostate tumor, prostate epithelial cells, normal colon, colon tumor, normal lung, lung tumor, and skin decubitus. Elevated levels of 56939 can be observed in many tumor samples as compared to their normal tissue counterparts.

TABLE 12
Tissue | Expression
Artery normal | 0
Vein normal | 0
Aortic smooth muscle cells (EARLY) | 0.186
Coronary smooth muscle cells (SMC) | 0
Static human umbilical vein endothelial cells (HUVEC) | 0
Shear HUVEC | 0
Heart normal | 0.3785
Heart (congestive heart failure) | 1.273
Kidney | 8.1867
Skeletal Muscle | 0.9064
Adipose normal | 2.4594
Pancreas | 1.6624
primary osteoblasts | 0.6636
Osteoclasts (differentiated) | 1.2425
Skin normal | 0.7973
Spinal cord normal | 0.2144
Brain Cortex normal | 28.8057
Brain Hypothalamus normal | 5.4766
Nerve | 0
DRG (Dorsal Root Ganglion) | 3.4065
Glial Cells (Astrocytes) | 0.6659
Glioblastoma | 0
Breast normal | 0.4824
Breast tumor | 1.0093
Ovary normal | 0.4071
Ovary Tumor | 0.7465
Prostate Normal | 0.2695
Prostate Tumor | 0.3932
Epithelial Cells (Prostate) | 0.6387
Colon normal | 0.1021
Colon Tumor | 2.5996
Lung normal | 0.1244
Lung tumor | 0.3694
Lung COPD | 0.1608
Colon IBD | 0.0226
Liver normal | 5.5147
Liver fibrosis | 17.9795
Dermal Cells- fibroblasts | 0.0231
Spleen normal | 0
Tonsil normal | 0.0107
Lymph node | 0
Small intestine | 0.2497
Skin-Decubitus | 0.7465
Synovium | 0
BM-MNC (Bone marrow mononuclear cells) | 0
Activated PB-MC | 0.0584
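The tumor-versus-normal comparison drawn from Table 12 can be made explicit by computing a fold change for each matched tissue pair. The sketch below is illustrative only: the values are copied from Table 12, while the pairing of rows and the variable names are assumptions, and single-sample ratios of this kind carry no statistical weight.

```python
# (normal, tumor) expression values copied from Table 12.
pairs = {
    "Breast": (0.4824, 1.0093),
    "Ovary": (0.4071, 0.7465),
    "Prostate": (0.2695, 0.3932),
    "Colon": (0.1021, 2.5996),
    "Lung": (0.1244, 0.3694),
}

# Fold change = tumor value divided by the matched normal value.
fold_change = {tissue: tumor / normal for tissue, (normal, tumor) in pairs.items()}

# Print tissues from the largest to the smallest tumor/normal ratio.
for tissue, fold in sorted(fold_change.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{tissue}: {fold:.1f}-fold")
```

In this tabulation every listed tumor sample exceeds its normal counterpart, with colon showing by far the largest ratio.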

[5950] Table 13 below depicts the expression of 56939 mRNA in a panel of normal and diseased vessel tissue using TaqMan analysis. Elevated expression of 56939 was found in adipose tissue. Lower levels of expression were detected in normal carotid artery, normal muscular artery, diseased iliac artery, diseased tibial artery, diseased aorta, normal saphenous vein, diseased saphenous vein, normal vein, normal coronary artery, HUVEC, and HAEC.

TABLE 13
Tissue | Expression
Aortic SMC | 0.07
Human microvascular endothelial cells (HMVEC) | 0.03
Human/Adipose | 4.68
Human/Artery/Normal/Carotid | 0.29
Human/Artery/Normal/Carotid | 0.30
Human/Artery/Normal/Muscular | 0.15
Human/Artery/Diseased/Iliac | 0.12
Human/Artery/Diseased/Tibial | 0.08
Human/Aorta/Diseased | 0.20
Human/Vein/Normal/Saphenous | 0.05
Human/Vein/Normal/Saphenous | 0.08
Human/Vein/Normal/Saphenous | 0.07
Human/Vein/Normal/Saphenous | 0.16
Human/Vein/Diseased/Saphenous | 0.07
Human/Vein/Normal | 0.13
Human/Vein/Normal/Saphenous | 0.09
Human/Vein/Normal | 0.00
Mouse/Artery/Normal/Coronary | 0.18
Mouse/Artery/Normal/Coronary | 0.24
Mouse/Artery/Normal/Coronary | 0.24
Mouse/Artery/Normal/Coronary | 0.32
Mouse/Vein/Normal | 0.10
HUVEC Vehicle | 0.14
HUVECM | 0.08
Human amniotic epithelial cells (HAEC) Vehicle | 0.18
HAECM | 0.07

[5951] Table 14 below depicts the expression of 56939 mRNA in a panel of normal and diseased liver, heart, kidney, and skeletal muscle tissue, using TaqMan analysis. The panel also includes the expression of 56939 mRNA in biopsied monkey liver following feeding of the indicated diet (e.g., a standard chow diet (CHOW), diets containing polyunsaturated fatty acids (POLY), and diets containing both polyunsaturated fatty acids and cholesterol (CHOL)). Elevated expression of 56939 mRNA can be observed in normal kidney.

TABLE 14
Tissue | Expression
Normal Heart | 0.20
Normal Kidney | 1.45
Normal Skeletal Muscle | 0.25
Normal Liver | 0.17
Normal Liver | 0.15
Normal Liver | 0.31
Normal Liver | 0.06
Normal Liver | 0.80
Normal Liver | 0.35
Normal Liver | 0.03
Diseased Liver | 0.04
Diseased Liver | 0.15
Diseased Liver | 0.12
Diseased Liver | 0.69
Monkey Liver (Chow Diet) | 0.38
Monkey Liver (Poly Diet W/out CHOL) | 0.27
Monkey Liver (Poly Diet W/CHOL) | 0.28
Monkey Liver (Chow Diet) | 0.21
Monkey Liver (Sat Diet W/out CHOL) | 7.64
Monkey Liver (Sat Diet W/CHOL) | 0.20
CHOW DIET | 0.12
POLY DIET W/OUT CHOL | 0.09
POLY DIET W/CHOL | 0.11
CHOW DIET | 0.07
POLY DIET W/OUT CHOL | 0.05
POLY DIET W/CHOL | 0.06
CHOW DIET | 0.04
POLY DIET W/OUT CHOL | 0.14
POLY DIET W/CHOL | 0.12
CHOW DIET | 0.06
POLY DIET W/OUT CHOL | 0.09
POLY DIET W/CHOL | 0.07
CHOW DIET | 0.08
POLY DIET W/CHOL | 0.20
CHOW DIET | 0.10
POLY DIET W/OUT CHOL | 0.19
POLY DIET W/CHOL | 0.14
CHOW DIET | 0.07
POLY DIET W/OUT CHOL | 0.15
POLY DIET W/CHOL | 0.22
CHOW DIET | 0.28
POLY DIET W/OUT CHOL | 0.30
POLY DIET W/CHOL | 0.34
Liver | 1.62
Liver | 0.38
Liver | 0.08
Liver | 0.28

[5952] Table 15 below depicts the expression of 56939 mRNA in a panel of biopsied monkey livers following feeding of the indicated diet (i.e., a standard chow diet (CHOW), and diets containing polyunsaturated fatty acids (POLY) with and without cholesterol (CHOL), and diets containing monounsaturated fatty acids (MONO) with and without cholesterol (CHOL)), using TaqMan analysis.

TABLE 15
Tissue | Expression
CHOW DIET | 0.256
POLY DIET W/OUT CHOL | 0.178
POLY DIET W/CHOL | 0.297
CHOW DIET | 0.309
POLY DIET W/OUT CHOL | 0.283
POLY DIET W/CHOL | 0.064
CHOW DIET | 0.066
POLY DIET W/OUT CHOL | 0.202
POLY DIET W/CHOL | 0.253
CHOW DIET | 0.196
POLY DIET W/OUT CHOL | 0.315
POLY DIET W/CHOL | 0.062
CHOW DIET | 0.069
POLY DIET W/OUT CHOL | 0.115
POLY DIET W/CHOL | 0.352
CHOW DIET | 0.268
POLY DIET W/OUT CHOL | 0.378
POLY DIET W/CHOL | 0.090
CHOW DIET | 0.061
POLY DIET W/OUT CHOL | 0.133
POLY DIET W/CHOL | 0.147
CHOW DIET | 0.262
POLY DIET W/OUT CHOL | 0.262
POLY DIET W/CHOL | 0.078
CHOW DIET | 0.035
MONO DIET W/OUT CHOL | 0.079
MONO DIET W/CHOL | 0.187
CHOW DIET | 0.165
MONO DIET W/OUT CHOL | 0.132
MONO DIET W/CHOL | 0.053
CHOW DIET | 0.034
MONO DIET W/OUT CHOL | 0.050
MONO DIET W/CHOL | 0.063
CHOW DIET | 0.445
MONO DIET W/OUT CHOL | 0.321
MONO DIET W/CHOL | 0.046
CHOW DIET | 0.113
MONO DIET W/OUT CHOL | 0.111
MONO DIET W/CHOL | 0.116
CHOW DIET | 0.217
MONO DIET W/OUT CHOL | 0.094
MONO DIET W/CHOL | 0.050
CHOW DIET | 0.136
MONO DIET W/OUT CHOL | 0.176
MONO DIET W/CHOL | 0.266
CHOW DIET | 0.435
MONO DIET W/CHOL | 0.535

Example 36 Tissue Distribution of 56939 mRNA by Northern Analysis

[5953] Northern blot hybridizations with various RNA samples can be performed under standard conditions and washed under stringent conditions, i.e., 0.2×SSC at 65° C. A DNA probe corresponding to all or a portion of the 56939 cDNA (SEQ ID NO:48) can be used. The DNA was radioactively labeled with ³²P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) can be probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to manufacturer's recommendations.

Example 37 Recombinant Expression of 56939 in Bacterial Cells

[5954] In this example, 56939 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 56939 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-56939 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB199 strain by affinity chromatography on glutathione beads. Using polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates, the molecular weight of the resultant fusion polypeptide is determined.
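Before running the gel, it can help to have a rough expectation for the size of the fusion polypeptide. The sketch below is a back-of-the-envelope estimate, not a measured result: it assumes a GST moiety of about 26 kDa and an average residue mass of about 110 Da, and it ignores linker residues and any post-translational modification. The constant and function names are hypothetical.

```python
GST_TAG_KDA = 26.0       # assumed approximate mass of the GST moiety
AVG_RESIDUE_KDA = 0.110  # assumed average mass per amino acid residue


def estimate_fusion_mass_kda(n_residues: int) -> float:
    """Rough mass of a GST fusion from the length of the fused protein."""
    return GST_TAG_KDA + n_residues * AVG_RESIDUE_KDA


# The 56939 protein is stated to be 421 amino acids long.
print(round(estimate_fusion_mass_kda(421), 1))
```

Under these assumptions a full-length GST-56939 fusion would be expected to migrate at roughly 70 kDa; the observed mobility on SDS-PAGE may differ.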

Example 38 Expression of Recombinant 56939 Protein in COS Cells

[5955] To express the 56939 gene in COS cells (e.g., COS-7 cells, CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182), the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 56939 protein and an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[5956] To construct the plasmid, the 56939 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 56939 coding sequence starting from the initiation codon; the 3′ primer contains sequences complementary to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag, and the last 20 nucleotides of the 56939 coding sequence. The PCR amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably, the two restriction sites chosen are different so that the 56939 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.
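The primer layout described above can be sketched in a few lines of code. This is an illustration under stated assumptions, not the primers actually used: the restriction sites (EcoRI and XhoI), the particular HA-tag codons, the TAA stop codon, and the function names are all chosen here for the example. The toy template simply joins the first 20 and last 23 nucleotides of the 56939 coding sequence as recited in SEQ ID NO:48.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(cds: str, site_5p: str, site_3p: str, tag: str,
                   stop: str = "TAA", anneal: int = 20) -> tuple[str, str]:
    """Build a forward and a reverse primer for tagging a coding sequence.

    forward: restriction site + first `anneal` nt of the coding sequence
    reverse: reverse complement of (last `anneal` nt + tag + stop + site)
    """
    # Drop the native stop codon so the tag is read in frame.
    body = cds[:-3] if cds[-3:] in ("TAA", "TAG", "TGA") else cds
    forward = site_5p + body[:anneal]
    reverse = reverse_complement(body[-anneal:] + tag + stop + site_3p)
    return forward, reverse


# Assumed components for illustration only.
ECORI = "GAATTC"                         # hypothetical choice of 5' site
XHOI = "CTCGAG"                          # hypothetical choice of 3' site
HA_TAG = "TACCCATACGATGTTCCAGATTACGCT"   # one encoding of YPYDVPDYA

# Toy template: both ends of the 56939 coding sequence from SEQ ID NO:48.
toy_cds = "ATGTCAGCAACGCTGATCCT" + "AAACAGCTGTCCCTAAATTGTAA"

fwd, rev = design_primers(toy_cds, ECORI, XHOI, HA_TAG)
print(fwd)
print(rev)
```

Using two different sites, as the text recommends, fixes the orientation of the insert. In practice a few extra bases are often added 5′ of each restriction site to help the enzyme cut near the end of the PCR product; that detail is omitted here.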

[5957] COS cells are subsequently transfected with the 56939-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. The expression of the 56939 polypeptide is detected by radiolabelling (³⁵S-methionine or ³⁵S-cysteine available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) using an HA specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with ³⁵S-methionine (or ³⁵S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer, 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

[5958] Alternatively, DNA containing the 56939 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 56939 polypeptide is detected by radiolabelling and immunoprecipitation using a 56939 specific monoclonal antibody.

Examples for 33410

Example 39 Identification and Characterization of Human 33410 cDNA

[5959] The human 33410 sequence (FIG. 26; SEQ ID NO:53), which is approximately 4667 nucleotides long, including untranslated regions, contains a predicted methionine-initiated coding sequence of about 2508 nucleotides, including the termination codon (nucleotides indicated as “coding” of SEQ ID NO:53; SEQ ID NO:55). The coding sequence encodes an 835 amino acid protein (SEQ ID NO:54).

[5960] The location of the initiation and termination codons is indicated by the underline. The human 33410 nucleic acid sequence is recited as follows: GGCACGAGGCAACTTGGTCTGAATTCCAGGTCACTAACCACTTGTCTCTTCTGTT (SEQ ID NO: 53) TCCCCATTCCTTTCTGTCTGCCCCATCCAATTTCCTTTGCCCTCTTCCACCTCTGTA TTTTTCTGTCTGTCCGTCTGTCTGTATCCTGCCTCCCTGCCCCTCTCGCTCCACCCC CCGCAGGTCGGGCCTGCCTTCACCTTCTCCCACTTCCTTCCCCTTCCCCACCCCGT GCCCCCTCCATGGAGAGGAACAGACCCCTTCTCTGTCCAGTCTAACCCAGGTCCC TCCCCAACCCCCTCCTCCCTCCTTTCCCCCCGCCCCTCCTCCCTCCTGGGGCGAGG GGGGCCTCCCTCCCTCTCCCCCCCTTCTCTCTCTCTCCGAGGGGGGGGGGTCCCA GGGAGGGAGGGGGGGTCCCCCGATCAGCATGTGGCTCCTGGCGCTGTGTCTGGT GGGGCTGGCGGGGGCTCAACGCGGGGGAGGGGGTCCCGGCGGCGGCGCCCCGG GCGGCCCCGGCCTGGGCCTCGGCAGCCTCGGCGAGGAGCGCTTCCCGGTGGTGA ACACGGCCTACGGGCGAGTGCGCGGTGTGCGGCGCGAGCTCAACAACGAGATCC TGGGCCCCGTCGTGCAGTTCTTGGGCGTGCCCTACGCCACGCCGCCCCTGGGCGC CCGCCGCTTCCAGCCGCCTGAGGCGCCCGCCTCGTGGCCCGGCGTGCGCAACGC CACCACCCTGCCGCCCGCCTGCCCGCAGAACCTGCACGGGGCGCTGCCCGCCAT CATGCTGCCTGTGTGGTTCACCGACAACTTGGAGGCGGCCGCCACCTACGTGCA GAACCAGAGCGAGGACTGCCTGTACCTCAACCTCTACGTGCCCACCGAGGACGG TCCGCTCACAAAAAAACGTGACGAGGCGACGCTCAATCCGCCAGACACAGATAT CCGTGACCCTGGGAAGAAGCCTGTGATGCTGTTTCTCCATGGCGGCTCCTACATG GAGGGGACCGGAAACATGTTCGATGGCTCAGTCCTGGCTGCCTATGGCAACGTC ATTGTAGCCACGCTCAACTACCGTCTTGGGGTGCTCGGTTTTCTCAGCACCGGGG ACCAGGCTGCAAAAGGCAACTATGGGCTCCTGGACCAGATCCAGGCCCTGCGCT GGCTCAGTGAAAACATCGCCCACTTTGGGGGCGACCCCGAGCGTATCACCATCT TTGGTTCCGGGGCAGGGGCCTCCTGCGTCAACCTTCTGATCCTCTCCCACCATTC AGAAGGGCTGTTCCAGAAGGCCATCGCCCAGAGTGGCACCGCCATTTCCAGCTG GTCTGTCAACTACCAGCCGCTCAAGTACACGCGGCTGCTGGCAGCCAAGGTGGG CTGTGACCGAGAGGACAGTGCTGAAGCTGTGGAGTGTCTGCGCCGGAAGCCCTC CCGGGAGCTGGTGGACCAGGACGTGCAGCCTGCCCGCTACCACATCGCCTTTGG GCCCGTGGTGGATGGCGACGTGGTCCCCGATGACCCTGAGATCCTCATGCAGCA GGGAGAATTCCTCAACTACGACATGCTCATCGGCGTCAACCAGGGAGAGGGCCT CAAGTTCGTGGAGGACTCTGCAGAGAGCGAGGACGGTGTGTCTGCCAGCGCCTT TGACTTCACTGTCTCCAACTTTGTGGACAACCTGTATGGCTACCCGGAAGGCAAG GATGTGCTTCGGGAGACCATCAAGTTTATGTACACAGACTGGGCCGACCGGGAC 
AATGGCGAAATGCGCCGCAAAACCCTGCTGGCGCTCTTTACTGACCACCAATGG GTGGCACCAGCTGTGGCCACTGCCAAGCTGCACGCCGACTACCAGTCTCCCGTCT ACTTTTACACCTTCTACCACCACTGCCAGGCGGAGGGCCGGCCTGAGTGGGCAG ATGCGGCGCACGGGGATGAACTGCCCTATGTCTTTGGCGTGCCCATGGTGGGTGC CACCGACCTCTTCCCCTGTAACTTCTCCAAGAATGACGTCATGCTCAGTGCCGTG GTCATGACCTACTGGACCAACTTCGCCAAGACTGGGGACCCCAACCAGCCGGTG CCGCAGGATACCAAGTTCATCCACACCAAGCCCAATCGCTTCGAGGAGGTGGTG TGGAGCAAATTCAACAGCAAGGAGAAGCAGTATCTGCACATAGGCCTGAAGCCA CGCGTGCGTGACAACTACCGCGCCAACAAGGTGGCCTTCTGGCTGGAGCTCGTG CCCCACCTGCACAACCTGCACACGGAGCTCTTCACCACCACCACGCGCCTGCCTC CCTACGCCACGCGCTGGCCGCCTCGTCCCCCCGCTGGCGCCCCGGGCACACGCC GGCCCCCGCCGCCTGCCACCCTGCCTCCCGAGCCCGAGCCCGAGCCCGGCCCAA GGGCCTATGACCGCTTCCCCGGGGACTCACGGGACTACTCCACGGAGCTGAGCG TCACCGTGGCCGTGGGTGCCTCCCTCCTCTTCCTCAACATCCTGGCCTTTGCTGCC CTCTACTACAAGCGGGACCGGCGGCAGGAGCTGCGGTGCAGGCGGCTTAGCCCA CCTGGCGGCTCAGGCTCTGGCGTGCCTGGTGGGGGCCCCCTGCTCCCCGCCGCGG GCCGTGAGCTGCCACCAGAGGAGGAGCTGGTGTCACTGCAGCTGAAGCGGGGTG GTGGCGTCGGGGCGGACCCTGCCGAGGCTCTGCGCCCTGCCTGCCCGCCCGACT ACACCCTGGCCCTGCGCCGGGCACCGGACGATGTGCCTCTCTTGGCCCCCGGGG CCCTGACCCTGCTGCCCAGTGGCCTGGGGCCACCGCCACCCCCACCGCCCCCCTC CCTTCATCCCTTCGGGCCCTTCCCCCCGCCCCCTCCCACCGCCACCAGCCACAAC AACACGCTACCCCACCCCCACTCCACCACTCGGGTATAGGGGGTGGGTGGGGAG GCCCTCCTCCCCGGCCCTCCCTGGCCCGGCCACTCCGAAGGCAGGGAGGAGGAC TTGGCAACTGGCTTTTCTCCTGTGGAGTCGTCACACGCCATCCAGCAGCGCTAAG GTGGACATGGGATTCCTCCCTGCGATGCGTGTCTTTCCCACGCAGAGAAGCCCCA GTCTCTTCTCTGGATCTGGGCCTTTGAACAACTGGGGGGCGTTTTCTCCCCCCCAT TGGGACACCAGTCTTCGGTGTGTGGAATGTGGTATTTTCCCGCGTGGAGGTGTGC TTTCTCACAACGGGGTGTGTTTTCCCATGTGCAGGGTGAGGTTTTTTTTTGCCACC CTGGACACATGTTGGCCCCCTCAAAGAATTTCTGTGGGGATTTGTACCCCAGAAT CCTGTTCCCCCATCCCTTCTCCCACCTCCTCCCCTCTCCCTCCCCCTGGAGACCCT GGAAGTGGTGTGTTCACATACAGTGACCCTTGGCCACCAGACCACAGAGGATGG AGCCTGGGAAGCAGCGAGGAAATCACAGCCCCCTCGCCCCTGCCTCCCTTGCCC CTACCCCGGCGAAGCATGTTCCCCCCGACGCCCCCCTTGGCACAAGTCAGATGA AGCACGTTCTGCCGGGGAGGCCCTCACCTTCCAGAGAGGACAGACACAGATTTC CTGCTGGGGGAGGGAGGAGTCCACGCATCCTGATGCTGCCTGGAAGCTTATTTTC 
CCGTGGCCAGGACGCATTTCTCTGAGTGGAAACAGGTTCTTGCATGTGGATGTGT GTTTCCCCAGGCAGACGGCCCCTCTCTTCCCAGCACTTCCCTGCCTCCCCCAGGC CTCAGGCCCAGCACCCAGTTCCTCCTCACATGGCAGGTGAGCACAGACTTCTAGT TGGCAGGAGCTGAGGAGGGTGAACAAACCCCGAGGGAGGCCCGGCCCTTGCTCC CGAGTTGGGGGGAGGGGGTGTGGCAACGTGCCCCCCGCAGAGGCCACGCATGTT TGACCAAAGCCCTCATTGTGGTCCGAGGACAGCCTTTTCCCCAGGCCTCAGAGCA TTGCTCATCCGTGCCAAACTGGGTAGGTGGATTTGAGCGGAAAGACTCCCAAAA TGTGCCAAGAATTTCCCAGTCCCAGGCAGGGCAGGGGAAACTAAGGGCAAGCA GGATACAGGGCGAGGGATGTGGCAGGTGAGGGGGCTCCCGCCTGTGCCCCTTCT CCTCACCATGTCTCCCCCACCCTGCCTCAGTTCTCCGTTCCCCTTCATCTCCGTCC CCCTCTTTGAAGCTGTCCCCATCTCAGTGTCAGACCAGCCTTCTCCTCATCTGACC ACCCTCCTCTGACCGACGCCCCCTCCTTGTCTGAAAGAAAGGAGCCTTGAATGGT GGAGGGAGGCAGTGGGGAGAAAGGTCTCACCGGACAGGTTGGGAGAATGAGGT CAGCGGTGCTGGGGAACAGATGGAGGGGGCAGTGGGGACAGGGCTTGGGCAGA CACCAGCAGGAATAATTTGAAATGTGTGAGGTGACTCCCCGGAGGGCCTTGGGC TTGGGCATTTGGGAAAAGAATGATGTCTGGAAGGGCTTAAGGGACACAGTGGAC GAGGGGAGAGTCCTCATCTGCTGGCATTTTGTGGGGTGTTAGTGCCAAACTTGAA TAGGGGCTGGGGTGCTGTCTTCCACTGACACCCAAATCCAGAATCCCTGGTCTTG AGTCCCAGAACTTTGCCTCTTGACTGTCCCTC.

[5961] The coding sequence encodes a 835 amino acid protein (SEQ ID NO:54) and has the following amino acid sequence: MWLLALCLVGLAGAQRGGGGPGGGAPGGPGLGLGSLGEERFPVVNTAYGRVRGV (SEQ ID NO: 54) RRELNNEILGPVVQFLGVPYATPPLGARRFQPPEAPASWPGVRNATTLPPACPQNLH GALPAIMLPVWFTDNLEAAATYVQNQSEDCLYLNLYVPTEDGPLTKKRDEATLNPP DTDIRDPGKKLPVMLFLHGGSYMEGTGNMFDGSVLAAYGNVIVATLNYRLGVLGFL STGDQAAKGNYGLLDQIQALRWLSENIAHFGGDPERITIFGSGAGASCVNLLILSHHS EGLFQKAIAQSGTAISSWSVNYQPLKYTRLLAAKVGCDREDSAEAVECLRRKPSREI VDQDVQPARYHIAFGPVVDGDVVPDDPEILMQQGEFLNYDMLIGVNQGEGLKFVED SAESEDGVSASAFDFTVSNFVDNLYGYPEGKDVLRETIKFMYTDWADRDNGEMRR KTLLALFTDHQWVAPAVATAKLHADYQSPVYFYTFYHHCQAEGRPEWADAAHGD ELPYVFGVPMVGATDLFPCNFSKNDVMLSAVVMTYWTNFAKTGDPNQPVPQDTKF IHTKPNRFEEVVWSKFNSKEKQYLHIGLKPRVRDNYRANKVAFWLELVPHLHNLHT ELFTTTTRLPPYATRWPPRPPAGAPGTRRPPPPATLPPEPEPEPGPRAYDRFPGDSRDY STELSVTVAVGASLLFLNILAFAALYYKRDRRQELRCRRLSPPGGSGSGVPGGGPLLP AAGRELPPEEELVSLQLKRGGGVGADPAEALRPACPPDYTLALRRAPDDVPLLAPG ALTLLPSGLGPPPPPPPPSLHPFGPFPPPPPTATSHNNTLPHPHSTTRV.

Example 40 Tissue Distribution of 33410 mRNA by Large-Scale Tissue-Specific Library Sequencing and by Northern Blot Hybridization

[5962] Endogenous human 33410 gene expression was determined using the Perkin-Elmer/ABI 7700 Sequence Detection System, which employs TaqMan technology. Briefly, TaqMan technology relies on standard RT-PCR with the addition of a third gene-specific oligonucleotide (referred to as a probe) that has a fluorescent dye coupled to its 5′ end (typically 6-FAM) and a quenching dye at the 3′ end (typically TAMRA). When the fluorescently tagged oligonucleotide is intact, the fluorescent signal from the 5′ dye is quenched. As PCR proceeds, the 5′ to 3′ nucleolytic activity of Taq polymerase digests the labeled probe, producing a free nucleotide labeled with 6-FAM, which is now detected as a fluorescent signal. The PCR cycle at which fluorescence is first released and detected is inversely related to the logarithm of the starting amount of the gene of interest in the test sample, thus providing a way of quantitating the initial template concentration. Samples can be internally controlled by the addition of a second set of primers/probe that has been labeled with a different fluorophore on the 5′ end (typically VIC) and is directed toward, for example, a housekeeping gene such as GAPDH.

[5963] To determine the level of 33410 in various human tissues a primer/probe set was designed using Primer Express (Perkin-Elmer) software and primary cDNA sequence information. Total RNA was prepared from a series of human tissues using an RNeasy kit from Qiagen. First strand cDNA was prepared from one μg total RNA using an oligo-dT primer and Superscript II reverse transcriptase (Gibco/BRL). cDNA obtained from approximately 50 ng total RNA was used per TaqMan reaction.

[5964] Tissues tested included the human tissues shown in Table 16 below, including ovary, kidney, spleen, lung, liver, and colon, among others. Expression was found primarily in normal brain cortex, with expression to a lesser extent in normal artery, normal heart, normal kidney, normal spinal cord, normal brain hypothalamus, and normal ovary (Table 16). TaqMan analysis of 33410 RNA expression was also performed on a human liver panel comprising the following tissues: heart, kidney, skeletal muscle, and several samples of normal and diseased liver. Elevated expression was detected in heart tissue.

TABLE 16 Expression Data
Tissue Type | Relative Expression
Artery normal | 2.4894
Aorta diseased | 0
Vein normal | 0
Coronary SMC | 0
HUVEC | 12.9581
Hemangioma | 0.94
Heart normal | 0.398
Heart CHF | 0
Kidney | 0.2608
Skeletal Muscle | 0
Adipose normal | 0
Pancreas | 0
primary osteoblasts | 0
Osteoclasts (diff) | 0
Skin normal | 0
Spinal cord normal | 1.3621
Brain Cortex normal | 68.8691
Brain Hypothalamus normal | 9.3553
Nerve | 0
DRG (Dorsal Root Ganglion) | 0
Breast normal | 0
Breast tumor | 0
Ovary normal | 1.4957
Ovary tumor | 0
Prostate normal | 0
Prostate tumor | 0
Salivary glands | 0
Colon normal | 0
Colon tumor | 0
Lung normal | 0
Lung tumor | 0
Lung COPD | 0
Colon IBD | 0
Liver normal | 0
Liver fibrosis | 0
Spleen normal | 0
Tonsil normal | 0
Lymph node normal | 0
Small intestine normal | 0
Macrophages | 0
Synovium | 0
BM-MNC | 0
Activated PBMC | 0
Neutrophils | 0
Megakaryocytes | 0
Erythroid | 0
positive control | 11.3592

[5965] Table 16 depicts the expression of 33410 RNA in a panel of normal and tumor human tissues detected using TaqMan analysis. Elevated expression was detected in normal artery, HUVEC, hemangioma, normal heart, kidney, normal spinal cord, normal brain cortex, brain hypothalamus and normal ovary.
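The tissues named in this paragraph are the rows of Table 16 with a non-zero value, excluding the positive control. A short ranking makes the relative magnitudes easier to compare. The sketch below is illustrative only: the values are copied from Table 16, and expressing each tissue as a percentage of brain cortex is an assumption made here for readability, not a calculation from the original.

```python
# Non-zero tissue rows copied from Table 16 (positive control excluded).
nonzero_expression = {
    "Artery normal": 2.4894,
    "HUVEC": 12.9581,
    "Hemangioma": 0.94,
    "Heart normal": 0.398,
    "Kidney": 0.2608,
    "Spinal cord normal": 1.3621,
    "Brain Cortex normal": 68.8691,
    "Brain Hypothalamus normal": 9.3553,
    "Ovary normal": 1.4957,
}

# Rank tissues from highest to lowest relative expression.
ranked = sorted(nonzero_expression.items(), key=lambda kv: kv[1], reverse=True)
top_value = ranked[0][1]

for tissue, value in ranked:
    # Show each tissue as a percentage of the highest-expressing tissue.
    print(f"{tissue}: {value} ({100 * value / top_value:.1f}% of top)")
```

Ranked this way, brain cortex is the highest-expressing tissue, followed by HUVEC and brain hypothalamus, with the remaining tissues at a few percent of the brain cortex value or less.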

[5966] Northern blot hybridizations with various RNA samples can be performed under standard conditions and washed under stringent conditions, i.e., 0.2× SSC at 65° C. A DNA probe corresponding to all or a portion of the 33410 cDNA (SEQ ID NO:53) can be used. The DNA was radioactively labeled with ³²P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) can be probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to manufacturer's recommendations.

Example 41 Recombinant Expression of 33410 in Bacterial Cells

[5967] In this example, 33410 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 33410 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-33410 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB199 strain by affinity chromatography on glutathione beads. Using polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates, the molecular weight of the resultant fusion polypeptide is determined.

Example 42 Expression of Recombinant 33410 Protein in COS Cells

[5968] To express the 33410 gene in COS cells, the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 33410 protein and an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[5969] To construct the plasmid, the 33410 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 33410 coding sequence starting from the initiation codon; the 3′ primer contains sequences complementary to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag, and the last 20 nucleotides of the 33410 coding sequence. The PCR amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably, the two restriction sites chosen are different so that the 33410 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.

[5970] COS cells are subsequently transfected with the 33410-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989. The expression of the 33410 polypeptide is detected by radiolabeling (³⁵S-methionine or ³⁵S-cysteine available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1988) using an HA specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with ³⁵S-methionine (or ³⁵S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer, 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

[5971] Alternatively, DNA containing the 33410 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 33410 polypeptide is detected by radiolabeling and immunoprecipitation using a 33410 specific monoclonal antibody.

Examples for 33521

Example 43 Identification and Characterization of Human 33521 cDNA

[5972] The human 33521 nucleic acid sequence is recited as follows: GCACTGTGACAAGCTGCACGCTCTAGAGTCGACCCAGCAGGCTCTGTGTATGAA (SEQ ID NO: 61) TGACAAGGATACCTTCAGCCAGCTCATTCTGGATGAATGAATGATTACACTAAGT GTCCTCCACATTCCTCTGTGGGCTCACTTCATGGACTCACTTTGCGTGCTTGTTAA ATGTGCTGCGTTGCTCCCAAGACCATGTAAAGCCTACTGACCACTAACCTCCCTC ACAGCAGAAACTAGACGTCAGGTTAAAATGGGCAACTCCGACAGTCAGTACACC CTTCAAGGATCTAAAAATCATAGCAATACTATTACTGGTGCTAAGCAAATTCCTT GCTCCCTGAAAATACGTGGCGTTCATGCAAAAGAGGAAAAGTCATTGCATGGAT GGGGTCACGGGAGCAACGGAGCAGGTTACAAGTCCAGGTCCCTGGCCCGAAGCT GCCTTTCTCACTTTAAGAGTAACCAGCCTTACGCATCGAGACTCGGTGGCCCCAC ATGCAAGGTCTCCAGAGGTGTTGCCTACTCCACGCACAGGACAAATGCCCCAGG GAAGGATTTCCAGGGCATCAGTGCTGCTTTCTCAACTGAGAATGGCTTCCATTCT GTTGGCCACGAGCTGGCAGATAACCACATCACCTCCAGAGACTGCAACGGACAC CTTCTCAACTGCTACGGGAGGAATGAGAGCATTGCCTCCACCCCACCGGGCGAA GACCGCAAGAGCCCCCGAGTGCTCATCAAAACGCTGGGGAAGCCGGATGGGTGT TTAAGGGTCGAGTTCCACAATGGTGGCAACCCCAGCAAAGTGCCTGCAGAGGAC TGCAGTGAGCCGGTGCAGCTGCTGAGGTACTCACCTACCTTAGCATCGGAAACC TCCCCTGTGCCTGAAGCCAGGAGGGGGTCCAGCGCCGATTCCCTGCCCAGCCAT CGCCCCTCTCCCACGGACTCTCGCCTGCGGTCCAGCAAAGGCAGCTCCCTGAGTT CTGAGTCATCCTGGTACGACTCCCCTTGGGGCAATGCTGGAGAGCTGAGCGAGG CTGAGGGCTCCTTCCTGGCCCCCGGCATGCCTGACCCCAGTCTCCATGCCAGCTT CCCACCTGGCGATGCCAAAAAGCCTTTCAACCAAAGCTCTTCCCTCTCCTCCCTC CGGGAACTGTACAAAGATGCCAACCTGGGGAGCCTCTCCCCCTCAGGTATCCGC CTTTCTGATGAATACATGGGCACGCATGCCAGCCTGAGCAACCGTGTCTCTTTTG CTTCCGACATTGATGTGCCCTCCAGAGTGGCACACGGGGACCCCATCCAGTACA GTTCCTTCACTCTCCCCTGTCGGAAGCCCAAAGCCTTTGTTGAGGATACTGCGAA GAAGGACTCCCTCAAAGCCAGGATGCGACGGATCAGTGACTGGACGGGAAGCCT CTCAAGGAAGAAAAGGAAACTCCAGGAGCCGAGGTCCAAGGAGGGCAGTGACT ACTTTGACAGTCGCTCTGATGGACTGAATACAGATGTGCAGGGATCCTCCCAGGC ATCTGCTTTTCTGTGGTCAGGGGGCTCTACTCAGATCCTGTCTCAGAGAAGTGAA TCCACACATGCGATTGGCAGCGATCCCCTCCGGCAGAACATTTATGAGAATTTCA TGCGAGAGTTGGAAATGAGCAGGACCAACACTGAGAACATAGAAACATCTACA GAAACCGCCGAGTCCAGCAGCGAGTCACTCAGCTCTCTGGAACAGCTGGATCTG CTCTTCGAGAAGGAACAGGGGGTGGTCCGGAGGGCCGGGTGGCTCTTCTTCAAG CCCCTGGTCACTGTGCAGAAGGAAAGGAAGCTTGAGCTGGTGGCACGAAGGAAA 
TGGAAACAGTACTGGGTAACGCTGAAAGGATGCACGCTGCTGTTTTATGAGACC TATGGGAAGAATTCCATGGATCAGAGCAGTGCCCCTCGGTGTGCTCTGTTTGCAG AAGACAGCATAGTGCAGTCTGTTCCAGAGCATCCCAAGAAAGAAAATGTGTTCT GCCTCAGCAACTCCTTTGGAGATGTCTACCTTTTCCAGGCCACCAGCCAGACAGA TCTAGAAAACTGGGTCACTGCTGTACACTCTGCTTGTGCATCCCTTTTTGCAAAG AAGCATGGGAAAGAGGACACGCTGCGGCTGCTGAAGAACCAGACCAAAAACCT GCTTCAGAAGATAGACATGGACAGCAAGATGAAGAAGATGGCAGAGCTGCAGC TGTCCGTGGTGAGCGACCCAAAGAACAGGAAAGCCATAGAGAACCAGATCCAG CAATGGGAGCAGAATCTTGAGAAATTTCACATGGATCTGTTCAGGATGCGCTGCT ATCTGGCCAGCCTACAAGGTGGGGAGTTACCGAACCCAAAGAGTCTCCTTGCAG CCGCCAGCCGCCCCTCCAAGCTGGCCCTCGGCAGGCTGGGCATCTTGTCTGTTTC CTCTTTCCATGCTCTGGTATGTTCTAGAGATGACTCTGCTCTCCGGAAAAGGACA CTGTCACTGACCCAGCGAGGGAGAAACAAGAAGGGAATATTTTCTTCGTTAAAA GGGCTGGACACACTGGCCAGAAAAGGCAAGGAGAAGAGACCTTCTATAACTCA GGTCGATGAACTTCTGCATATATATGGTTCAACAGTGGACGGTGTTCCCCGAGAC AATGCATGGGAAATCCAGACTTATGTTCACTTTCAGGACAATCACGGAGTTACTG TAGGGATCAAGCCAGAGCACAGAGTAGAAGATATTTTGACTTTGGCATGCAAGA TGAGGCAGTTGGAACCCAGCCATTATGGCCTACAGCTTCGAAAATTAGTAGATG ACAATGTTGAGTATTGCATCCCTGCACCATATGAATATATGCAACAACAGGTTTA TGATGAAATAGAAGTCTTTCCACTAAATGTTTATGATGTGCAGCTCACGAAGACT GGGAGTGTGTGTGACTTTGGGTTTGCAGTTACAGCGCAGGTGGATGAGCGTCAG CATCTCAGCCGGATATTTATAAGCGACGTTCTTCCCGATGGCCTGGCGTATGGGG AAGGGCTGAGAAAGGGCAATGAGATCATGACCTTAAATGGGGAAGCTGTGTCTG ATCTTGACCTTAAGCAGATGGAGGCCCTGTTTTCTGAGAAGAGCGTCGGACTCAC TCTGATTGCCCGGCCTCCGGACACAAAAGCAACCCTGTGTACATCCTGGTCAGAC AGTGACCTGTTCTCCAGGGACCAGAAGAGTCTGCTGCCCCCTCCTAACCAGTCCC AACTGCTGGAGGAATTCCTGGATAACTTTAAAAAGAATACAGCCAATGATTTCA GCAACGTCCCTGATATCACAACAGGTCTGAAAAGGAGTCAGACAGATGGCACTC TGGATCAGGTTTCCCACAGGGAGAAAATGGAGCAGACATTCAGGAGTGCTGAGC AGATCACTGCACTGTGCAGGAGTTTTAACGACAGTCAGGCCAACGGCATGGAAG GACCGCGGGAGAATCAGGATCCTCCTCCGAGGCCTCTGGCCCGCCACCTGTCTG ATGCAGACCGCCTCCGCAAAGTCATCCAGGAGCTTGTGGACACAGAGAAGTCCT ACGTGAAGGATTTGAGCTGCCTCTTTGAATTATACTTGGAGCCACTTCAGAATGA GACCTTTCTTACCCAAGATGAGATGGAGTCACTTTTTGGAAGTTTGCCAGAGATG CTTGAGTTTCAGAAGGTGTTTCTGGAGACCCTGGAGGATGGGATTTCAGCATCAT CTGACTTTAACACCCTAGAAACCCCCTCACAGTTTAGAAAATTACTGTTTTCCCTT 
GGAGGCTCTTTCCTTTATTACGCGGACCACTTTAAACTGTACAGTGGATTCTGTG CTAACCATATCAAAGTACAGAAGGTTCTGGAGCGAGCTAAAACTGACAAAGCCT TCAAGGCTTTTCTGGATGCCCGGAACCCCACCAAGCAGCATTCCTCCACGCTGGA GTCCTACCTCATCAAGCCGGTTCAGAGAGTGCTCAAGTACCCGCTGCTGCTCAAG GAGCTGGTGTCCCTGACGGACCAGGAGAGCGAGGAGCACTACCACCTGACGGA AGCACTAAAGGCAATGGAGAAAGTAGCGAGCCACATCAATGAGATGCAGAAGA TCTATGAGGATTATGGGACCGTGTTTGACCAGCTAGTAGCTGAGCAGAGCGGAA CAGAGAAGGAGGTAACAGAACTTTCGATGGGAGAGCTTCTGATGCACTCTACGG TTTCCTGGTTGAATCCATTTCTGTCTCTAGGAAAAGCTAGAAAGGACCTTGAGCT CACAGTATTTGTTTTTAAGAGAGCCGTCATACTGGTTTATAAAGAAAACTGCAAA CTGAAAAAGAAATTGCCCTCGAATTCCCGGCCTGCACACAACTCTACTGACTTGG ACCCATTTAAATTCCGCTGGTTGATCCCCATCTCCGCGCTTCAAGTCAGACTGGG GAATCCAGCAGGGACAGAAAATAATTCCATATGGGAACTGATCCATACGAAGTC AGAAATAGAAGGACGGCCAGAAACCATCTTTCAGTTGTGTTGCAGTGACAGTGA AAGCAAAACCAACATTGTTAAGGTGATTCGTTCTATTCTGAGGGAAAACTTCAG GCGTCACATAAAGTGTGAATTACCACTGGAGAAAACGTGTAAGGATCGCCTGGT ACCTCTTAAGAACCGAGTTCCTGTTTCGGCCAAATTAGCTTCATCCAGGTCTTTA AAAGTCCTGAAGAATTCCTCCAGCAACGAGTGGACCGGTGAGACTGGCAAGGGA ACCTTGCTGGACTCTGACGAGGGCAGCTTGAGCAGCGGCACCCAGAGCAGCGGC TGCCCCACGGCTGAGGGCAGGCAGGACTCCAAGAGCACTTCTCCCGGGAAATAC CCACACCCCGGCTTGGCAGATTTTGCCGACAATCTCATCAAAGAGAGTGACATC CTGAGCGATGAAGATGATGACCACCGTCAGACTGTGAAGCAGGGCAGCCCTACT AAAGACATCGAAATTCAGTTCCAGAGACTGAGGATTTCCGAGGACCCAGACGTT CACCCCGAGGCTGAGCAGCAGCCTGGCCCGGAGTCGGGTGAGGGTCAGAAAGG AGGAGAGCAGCCCAAACTGGTCCGGGGGCACTTCTGCCCCATTAAACGAAAAGC CAACAGCACCAAGAGGGACAGAGGAACTTTGCTCAAGGCGCAGATCCGTCACCA GTCCCTTGACAGTCAGTCTGAAAATGCCACCATCGACCTAAATTCTGTTCTAGAG CGAGAATTCAGTGTCCAGAGTTTAACATCTGTTGTCAGTGAGGAGTGTTTTTATG AAACAGAGAGCCACGGAAAATCATAGTATGATTCAATCCAGATATGGGTTAAAT TCCTCATTTTACTTTTAAACTGGTGGTAAAGTGGAAATTGCAAAAAAAAAAAAAAA.

[5973] The human 33521 sequence (FIG. 30; SEQ ID NO:61) is approximately 5437 nucleotides long. The nucleic acid sequence includes an initiation codon (ATG) and a termination codon (TAG) which are underscored above. The region between and inclusive of the initiation codon and the termination codon is a methionine-initiated coding sequence of about 5106 nucleotides, including the termination codon (nucleotides indicated as “coding” of SEQ ID NO:61; SEQ ID NO:63). The coding sequence encodes a 1701 amino acid protein (SEQ ID NO:62), which is recited as follows: MGNSDSQYTLQGSKNHSNTITGAKQIPCSLKIRGVHAKEEKSLHGWGHGSNGAGYK (SEQ ID NO: 62) SRSLARSCLSHFKSNQPYASRLGGPTCKVSRGVAYSTHRTNAPGKDFQGISAAFSTE NGFHSVGHELADNHITSRDCNGHLLNCYGRNESIASTPPGEDRKSPRVLIKTLGKPD GCLRVEFHNGGNPSKVPAEDCSEPVQLLRYSPTLASETSPVPEARRGSSADSLPSHRP SPTDSRLRSSKGSSLSSESSWYDSPWGNAGELSEAEGSFLAPGMPDPSLHASFPPGDA KKPFNQSSSLSSLRELYKDANLGSLSPSGIRLSDEYMGTHASLSNRVSFASDIDVPSR VAHGDPIQYSSFTLPCRKPKAFVEDTAKKDSLKARMRRISDWTGSLSRKKRKLQEPR SKEGSDYFDSRSDGLNTDVQGSSQASAFLWSGGSTQILSQRSESTHAIGSDPLRQNIY ENFMRELEMSRTNTENIETSTETAESSSESLSSLEQLDLLFEKEQGVVRRAGWLFFKP LVTVQKERKLELVARRKWKQYWVTLKGCTLLFYETYGKNSMDQSSAPRCALFAED SIVQSVPEHPKKENVFCLSNSFGDVYLFQATSQTDLENWVTAVHSACASLFAKKHG KEDTLRLLKNQTKNLLQKIDMDSKMKKMAELQLSVVSDPKNRKAIENQIQQWEQN LEKFHMDLFRMRCYLASLQGGELPNPKSLLAAASRPSKLALGRLGILSVSSFHALVC SRDDSALRKRTLSLTQRGRNKKGIFSSLKGLDTLARKGKEKRPSITQVDELLHIYGST VDGVPRDNAWEIQTYVHFQDNHGVTVGIKPEHRVEDILTLACKMRQLEPSHYGLQL RKLVDDNVEYCIPAPYEYMQQQVYDEIEVFPLNVYDVQLTKTGSVCDFGFAVTAQV DERQHLSRIFISDVLPDGLAYGEGLRKGNEIMTLNGEAVSDLDLKQMEALFSEKSVG LTLIARPPDTKATLCTSWSDSDLFSRDQKSLLPPPNQSQLLEEFLDNFKKNTANDFSN VPDITTGLKRSQTDGTLDQVSHREKMEQTFRSAEQITALCRSFNDSQANGMEGPREN QDPPPRPLARHLSDADRLRKVIQELVDTEKSYVKDLSCLFELYLEPLQNETFLTQDE MESLFGSLPEMLEFQKVFLETLEDGISASSDFNTLETPSQFRKLLFSLGGSFLYYADHF KLYSGFCANHIKVQKVLERAKTDKAFKAFLDARNPTKQHSSTLESYLIKPVQRVLKY PLLLKELVSLTDQESEEHYHLTEALKAMEKVASHINEMQKIYEDYGTVFDQLVAEQS GTEKEVTELSMGELLMHSTVSWLNPFLSLGKARKDLELTVFVFKRAVILVYKENCK 
LKKKLPSNSRPAHNSTDLDPFKFRWLIPISALQVRLGNPAGTENNSIWELIHTKSEIEG RPETIFQLCCSDSESKTNIVKVIRSILRENFRRHIKCELPLEKTCKDRLVPLKNRVPVSA KLASSRSLKVLKNSSSNEWTGETGKGTLLDSDEGSLSSGTQSSGCPTAEGRQDSKST SPGKYPHPGLADFADNLIKESDILSDEDDDHRQTVKQGSPTKDIEIQFQRLRISEDPDV HPEAEQQPGPESGEGQKGGEQPKLVRGHFCPIKRKANSTKRDRGTLLKAQIRHQSLD SQSENATIDLNSVLEREFSVQSLTSVVSEECFYETESHGKS.

Example 44 Tissue Distribution of 33521 mRNA by TaqMan Analysis

[5974] Endogenous human 33521 gene expression was determined using the Perkin-Elmer/ABI 7700 Sequence Detection System, which employs TaqMan technology. Briefly, TaqMan technology relies on standard RT-PCR with the addition of a third gene-specific oligonucleotide (referred to as a probe) which has a fluorescent dye coupled to its 5′ end (typically 6-FAM) and a quenching dye at the 3′ end (typically TAMRA). When the fluorescently tagged oligonucleotide is intact, the fluorescent signal from the 5′ dye is quenched. As PCR proceeds, the 5′ to 3′ nucleolytic activity of Taq polymerase digests the labeled probe, producing a free nucleotide labeled with 6-FAM, which is now detected as a fluorescent signal. The PCR cycle at which fluorescence is first released and detected is inversely related to the logarithm of the starting amount of the gene of interest in the test sample, thus providing a quantitative measure of the initial template concentration. Samples can be internally controlled by the addition of a second set of primers/probe specific for a housekeeping gene such as GAPDH which has been labeled with a different fluorophore on the 5′ end (typically VIC).
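The relationship between detection cycle and starting template can be illustrated with a short sketch. It assumes a comparative threshold-cycle calculation with ideal doubling per cycle and normalization to the housekeeping gene. The specification does not state which calculation was used, and the function name and Ct values below are hypothetical.

```python
def relative_expression(ct_target, ct_reference, efficiency=2.0):
    """Abundance of the target relative to the reference gene.

    Assumes the amplicon multiplies by `efficiency` each cycle, so a
    template that crosses the detection threshold one cycle earlier was
    `efficiency`-fold more abundant at the start.
    """
    return efficiency ** (ct_reference - ct_target)


# Hypothetical threshold cycles, for illustration only.
sample_a = relative_expression(ct_target=25.0, ct_reference=20.0)
sample_b = relative_expression(ct_target=23.0, ct_reference=20.0)

# Sample B crosses the threshold two cycles earlier than sample A.
fold_difference = sample_b / sample_a
print(fold_difference)
```

Under these assumptions, a two-cycle difference in detection corresponds to a four-fold difference in starting template.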

[5975] To determine the level of 33521 in various human tissues a primer/probe set was designed. Total RNA was prepared from a series of human tissues using an RNeasy kit from Qiagen. First strand cDNA was prepared from 1 μg total RNA using an oligo-dT primer and Superscript II reverse transcriptase (Gibco/BRL). cDNA obtained from approximately 50 ng total RNA was used per TaqMan reaction. Tissues tested include the human tissues and several cell lines shown in Table 17. 33521 mRNA was detected in glioblastoma cells. As compared to their normal tissue counterparts, elevated levels of expression were observed in breast, colon, and lung tumors.

TABLE 17
Expression of 33521 in various tissues

Tissue                Expression
Glioblastoma          21.33
Glioblastoma          22.78
Glioblastoma          48.17
Glioblastoma          59.71
Brain Normal          11.67
Placenta Normal       5.58
Skin Normal           9.68
Adipose Normal        1.18
Breast Normal         3.89
Breast Tumor          100.78
Breast Tumor          5.62
Breast Tumor          23.34
Colon Normal          1.13
Colon Tumor           16.91
Colon Tumor           6.73
Colon Tumor           19.56
Liver Normal          1.20
Colon Metastasis      12.21
Colon Metastasis      9.99
Lung Normal           1.00
Lung Tumor            5.98
Lung Tumor            11.24
Lung Tumor            4.08
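The tumor-versus-normal comparison drawn from Table 17 can be sketched as a simple ratio of mean expression values. The grouping of rows by tissue, the use of the arithmetic mean, and the function name are illustrative assumptions. The colon metastasis samples are left out because Table 17 lists no matched normal value for them.

```python
from statistics import mean

# Relative expression values copied from Table 17.
table_17 = {
    "Breast": {"normal": [3.89], "tumor": [100.78, 5.62, 23.34]},
    "Colon": {"normal": [1.13], "tumor": [16.91, 6.73, 19.56]},
    "Lung": {"normal": [1.00], "tumor": [5.98, 11.24, 4.08]},
}


def tumor_to_normal_ratio(values):
    """Mean tumor expression divided by mean normal expression."""
    return mean(values["tumor"]) / mean(values["normal"])


ratios = {tissue: tumor_to_normal_ratio(v) for tissue, v in table_17.items()}
for tissue, ratio in ratios.items():
    print(f"{tissue}: {ratio:.1f}-fold relative to normal")
```

A ratio above 1 for each tissue is consistent with the elevated tumor expression described above. With one normal sample per tissue, the ratios are descriptive only.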

Example 45 Tissue Distribution of 33521 mRNA by Northern Analysis

[5976] Northern blot hybridizations with various RNA samples can be performed under standard conditions and washed under stringent conditions, i.e., 0.2× SSC at 65° C. A DNA probe corresponding to all or a portion of the 33521 cDNA (SEQ ID NO:61) can be used. The DNA can be radioactively labeled with ³²P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) can be probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to the manufacturer's recommendations.

Example 46 Recombinant Expression of 33521 in Bacterial Cells

[5977] In this example, 33521 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 33521 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-33521 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB199 strain by affinity chromatography on glutathione beads. The molecular weight of the resultant fusion polypeptide is determined by polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates.

Example 47 Expression of Recombinant 33521 Protein in COS Cells

[5978] To express the 33521 gene in COS cells (e.g., COS-7 cells, CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182), the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 33521 protein and an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[5979] To construct the plasmid, the 33521 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 33521 coding sequence starting from the initiation codon; the 3′ primer contains sequences complementary to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag, and the last 20 nucleotides of the 33521 coding sequence. The PCR-amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably the two restriction sites chosen are different so that the 33521 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.
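The primer layout described above can be sketched as follows. The coding-sequence ends are taken from SEQ ID NO:61 as recited above, using the 20 nucleotides immediately preceding the native TAG stop codon for the 3′ end. The restriction sites (HindIII, XhoI), the HA-tag coding sequence for YPYDVPDYA, the TGA stop codon, and the function names are illustrative assumptions. The specification does not name the sites, and any real choice would need to be checked for absence from the insert.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Ends of the 33521 coding sequence (SEQ ID NO:61).
CDS_FIRST_20 = "ATGGGCAACTCCGACAGTCA"  # starts at the initiation codon
CDS_LAST_20 = "CAGAGAGCCACGGAAAATCA"   # last 20 nt before the native stop

# Assumed for illustration only.
SITE_5 = "AAGCTT"                       # HindIII
SITE_3 = "CTCGAG"                       # XhoI
HA_TAG = "TACCCATACGATGTTCCAGATTACGCT"  # encodes YPYDVPDYA


def design_primers(cds_start, cds_end, site_5, site_3, tag, stop="TGA"):
    """Build the 5' and 3' primers, each written 5' to 3'."""
    forward = site_5 + cds_start
    # Sense-strand layout at the 3' end of the amplicon:
    # coding sequence, in-frame tag, stop codon, restriction site.
    sense_3_end = cds_end + tag + stop + site_3
    reverse = reverse_complement(sense_3_end)
    return forward, reverse


forward_primer, reverse_primer = design_primers(
    CDS_FIRST_20, CDS_LAST_20, SITE_5, SITE_3, HA_TAG
)
print(forward_primer)
print(reverse_primer)
```

Because the 3′ primer anneals to the sense strand, it is the reverse complement of the desired sense-strand arrangement. This keeps the tag in frame with the final codon and places the stop codon after the tag. Practical primers usually also carry a few extra bases 5′ of each restriction site to aid digestion, which this sketch omits.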

[5980] COS cells are subsequently transfected with the 33521-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. The expression of the 33521 polypeptide is detected by radiolabelling (³⁵S-methionine or ³⁵S-cysteine available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) using an HA specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with ³⁵S-methionine (or ³⁵S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer, 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

[5981] Alternatively, DNA containing the 33521 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 33521 polypeptide is detected by radiolabelling and immunoprecipitation using a 33521 specific monoclonal antibody.

Examples of 23479, 48120, and 46689

Example 48 Identification and Characterization of Human 23479, 48120, or 46689 cDNA

[5982] The human 23479 nucleic acid sequence is recited as follows: TGCTGTCGCTAGATTCAGATGATTCAAGTGAGGATCAAGTGGAAAATAGTAAAA (SEQ ID NO: 74) ATTCCTGGAGTTGCAAGTTTGTTGCTGCTGGAGGGCTTCAACAGTTATTAGAAAT TTTTAATTCTGGAATTCTAGAGCCTAAAGAGCAGGAATCATGGACTGTGTGGCAG CTAGACTGTCTTGCTTGCTTGCTGAAGTTAATATGCCAGTTTGCAGTAGATCCAT CCGATTTGGATTTAGCTTATCATGATGTCTTTGCCTGGTCTGGTATAGCGGAAAG CCATAGGAAAAGAACCTGGCCTGGCAAATCAAGGAAGGCTGCTGGTGATCATGC TAAGGGTCTTCATATACCACGATTAACAGAGGTATTTCTTGTTCTTGTCCAAGGA ACCAGTTTGATTCAGCGACTTATGTCTGTTGCTTATACGTATGATAATCTGGCTC CTAGAGTTTTAAAAGCTCAGTCTGATCACAGGTCTAGACATGAAGTTTCACATTA TTCAATGTGGCTCTTGGTGAGTTGGGCTCATTGCTGTTCTTTAGTGAAATCTAGCC TTGCTGATAGCGATCATTTACAAGATTGGCTAAAGAAATTGACTCTCCTTATTCC TGAGACTGCAGTTCGTCATGAATCATGCAGTGGTCTCTATAAGTTATCCCTGTCA GGGCTGGATGGAGGAGACTCAATCAATCGTTCTTTTCTGCTATTGGCTGCCTCAA CATTATTGAAATTTCTTCCTGATGCTCAAGCACTCAAACCTATTAGGATAGATGA TTATGAGGAAGAACCAATATTAAAACCAGGATGTAAAGAGTATTTTTGGTTGTTA TGCAAATTAGTTGACAACATACATATAAAGGACGCTAGTCAGACAACGCTCCTC GACTTAGATGCCTTGGCAAGACATTTGGCTGACTGTATTCGAAGTAGGGAGATCC TTGATCATCAGGATGGTAATGTAGAAGATGATGGGCTTACAGGACTCCTAAGGC TTGCAACAAGTGTTGTTAAACACAAACCACCCTTTAAATTTTCAAGGGAAGGAC AGGAATTTTTGAGAGATATCTTCAATCTCCTGTTTTTGTTGCCAAGTCTAAAGGA CCGACAACAGCCAAAGTGCAAATCACATTCTACAAGAGCTGCCGCTTACGATTT GTTAGTAGAGATGGTAAAGGGGTCTGTTGAGAACTACAGGCTAATACACAACTG GGTTATGGCACAACACATGCAGTCCCATGCACCTTATAAATGGGATTACTGGCCT CATGAAGATGTCCGTGCTGAATGTAGATTTGTTGGCCTTACTAACCTTGGAGCTA CTTGTTACTTAGCTTCTACTATTCAGCAACTTTATATGATACCTGAGGCAAGACA GGCTGTCTTCACTGCCAAGTATTCAGAGGATATGAAGCACAAGACCACTCTTCTG GAGCTTCAGAAAATGTTTACATATTTAATGGAGAGTGAATGCAAAGCATATAAT CCTAGACCTTTCTGTAAAACATACACCATGGATAAGCAGCCTCTGAATACTGGGG AACAGAAAGATATGACAGAGTTTTTTACTGATCTAATTACCAAAATCGAAGAAA TGTCTCCCGAACTGAAAAATACCGTCAAAAGTTTATTTGGAGGTGTAATTACAAA CAATGTTGTATCCTTGGATTGTGAACATGTTAGTCAAACTGCTGAAGAGTTTTAT ACTGTGAGGTGCCAAGTGGCTGATATGAAGAACATTTATGAATCTCTTGATGAA GTTACTATAAAAGACACTTTGGAAGGTGATAACATGTATACTTGTTCTCATTGTG GGAAGAAAGTACGAGCTGAAAAAAGGGCATGTTTTAAGAAATTGCCTCGCATTT 
TGAGTTTCAATACTATGAGATACACATTTAATATGGTCACGATGATGAAAGAGA AAGTGAATACACACTTTTCCTTCCCATTACGTTTGGACATGACGCCCTATACAGA AGATTTTCTTATGGGAAAGAGTGAGAGGAAAGAAGGTTTTAAAGAAGTCAGTGA TCATTCAAAAGACTCAGAGAGCTATGAATATGACTTGATAGGAGTGACTGTTCA CACAGGAACGGCAGATGGTGGACACTATTATAGCTTTATCAGAGATATAGTAAA TCCCCATGCTTATAAAAACAATAAATGGTATCTTTTTAATGATGCTGAGGTAAAA CCTTTTGATTCTGCTCAACTTGCATCTGAATGTTTTGGTGGAGAGATGACGACCA AGACCTATGATTCTGTTACAGATAAATTTATGGACTTCTCTTTTGAAAAGACACA CAGTGCATATATGCTGTTTTACAAACGCATGGAACCAGAGGAAGAAAATGGCAG AGAATACAAATTTGATGTTTCGTCAGAGTTACTAGAGTGGATTTGGCATGATAAC ATGCAGTTTCTTCAAGACAAAAACATTTTTGAACATACATATTTTGGATTTATGT GGCAATTGTGTAGTTGTATTCCCAGTACATTACCAGATCCTAAAGCTGTGTCCTT AATGACAGCAAAGTTAAGCACTTCCTTTGTCCTAGAGACATTTATTCATTCTAAA GAAAAGCCCACGATGCTTCAGTGGATTGAACTGTTGACGAAACAGTTTAATAAT AGTCAGGCAGCTTGTGAGTGGTTTTTAGATCGTATGGCTGATGACGACTGGTGGC CAATGCAGATACTAATTAAGTGCCCTAATCAAATTGTGAGACAGATGTTTCAGCG TTTGTGTATCCATGTGATTCAGAGGCTGAGACCTGTGCATGCTCATCTCTATTTGC AGCCAGGAATGGAAGATGGGTCAGATGATATGGATACCTCAGTAGAAGATATTG GTGGTCGTTCATGTGTCACTCGCTTTGTGAGAACCCTGTTATTAATTATGGAACA TGGTGTAAAACCTCACAGTAAACATCTTACAGAGTATTTTGCCTTCCTTTACGAA TTTGCAAAAATGGGTGAAGAAGAGAGCCAATTTTTGCTTTCATTGCAAGCTATAT CTACAATGGTACATTTTTACATGGGAACAAAAGGACCTGAAAATCCTCAAGTTG AAGTGTTATCAGAGGAAGAAGGGGAAGAAGAAGAGGAGGAAGAAGATATCCTC TCTCTGGCAGAAGAAAAATACAGGCCAGCTGCCCTTGAAAAGATGATAGCTTTA GTTGCTCTTTTGGTTGAACAGTCTCGATCAGAAAGGTGAAATGTTTCGAATTTAA AATGTTTAAAGCATGTTTGGTTTTATTATTTTTACATAATTGTTTACCACTAGTTT TTCCACTAGCTTTTTATTATATATGTTTAATTATGTAATTGTTATTCACTAGCTTTT ATTATATAAATCCTTTTAAATAATACTACTATTCATCAACTCTTGTGGCATAAGA ATTTCAGTTTTTTCTACCAAACTTTTACTTCATCTATGAGTCGTGTTAGAAATAGT CATTGAAAAAATATACAGTAAAATATCTAAAAAAAAAAAAAAAGG.

[5983] The human 23479 sequence (FIG. 33; SEQ ID NO:74) is approximately 3494 nucleotides long. The nucleic acid sequence includes an initiation codon (ATG) and a termination codon (TGA) which are underscored above. The region between and inclusive of the initiation codon and the termination codon is a methionine-initiated coding sequence of about 2805 nucleotides, including the termination codon (nucleotides indicated as “coding” of SEQ ID NO:74; SEQ ID NO:76). The coding sequence encodes a 934 amino acid protein (SEQ ID NO:75), which is recited as follows: MSVAYTYDNLAPRVLKAQSDHRSRHEVSHYSMWLLVSWAHCCSLVKSSLADSDHL (SEQ ID NO: 75) QDWLKKLTLLIPETAVRHESCSGLYKLSLSGLDGGDSINRSFLLLAASTLLKFLPDAQ ALKPIRIDDYEEEPILKPGCKEYFWLLCKLVDNIHIKDASQTTLLDLDALARHLADCI RSREILDHQDGNVEDDGLTGLLRLATSVVKHKPPFKFSREGQEFLRDIFNLLFLLPSL KDRQQPKCKSHSTRAAAYDLLVEMVKGSVENYRLIHNWVMAQHMQSHAPYKWD YWPHEDVRAECRFVGLTNLGATCYLASTIQQLYMIPEARQAVFTAKYSEDMKHKTT LLELQKMFTYLMESECKAYNPRPFCKTYTMDKQPLNTGEQKDMTEFFTDLITKIEEM SPELKNTVKSLFGGVITNNVVSLDCEHVSQTAEEFYTVRCQVADMKNIYESLDEVTI KDTLEGDNMYTCSHCGKKVRAEKRACFKKLPRILSFNTMRYTFNMVTMMKEKVNT HFSFPLRLDMTPYTEDFLMGKSERKEGFKEVSDHSKDSESYEYDLIGVTVHTGTADG GHYYSFIRDIVNPHAYKNNKWYLFNDAEVKPFDSAQLASECFGGEMTTKTYDSVTD KFMDFSFEKTHSAYMLFYKRMEPEEENGREYKFDVSSELLEWIWHDNMQFLQDKNI FEHTYFGFMWQLCSCIPSTLPDPKAVSLMTAKLSTSFVLETFIHSKEKPTMLQWIELL TKQFNNSQAACEWFLDRMADDDWWPMQILIKCPNQIVRQMFQRLCIHVIQRLRPVH AHLYLQPGMEDGSDDMDTSVEDIGGRSCVTRFVRTLLLIMEHGVKPHSKHLTEYFA FLYEFAKMGEEESQFLLSLQAISTMVHFYMGTKGPENPQVEVLSEEEGEEEEEEEDIL SLAEEKYRPAALEKMIALVALLVEQSRSER.

[5984] The human 48120 nucleic acid sequence is recited as follows: CCACGCGTCCGGCCTAGTCCTGAGAGGCTGGGCCGGCGGCGGCTGCGGCGGGAG (SEQ ID NO:77) ACCGGTGACCCGCGGCTGGGCGCCTCGGCC ATG ACTGCGGAGCTGCAGCAGGAC GACGCGGCCGGCGCGGCAGACGGCCACGGCTCGAGCTGCCAAATGCTGTTAAAT CAACTGAGAGAAATCACAGGCATTCAGGACCCTTCCTTTCTCCATGAAGCTCTGA AGGCCAGTAATGGTGACATTACTCAGGCAGTCAGCCTTCTCACTGATGAGAGAG TTAAGGAGCCCAGTCAAGACACTGTTGCTACAGAACCATCTGAAGTAGAGGGGA GTGCTGCCAACAAGGAAGTATTAGCAAAAGTTATAGACCTTACTCATGATAACA AAGATGATCTTCAGGCTGCCATTGCTTTGAGTCTACTGGAGTCTCCCAAAATTCA AGCTGATGGAAGAGATCTTAACAGGATGCATGAAGCAACCTCTGCAGAAACTAA ACGCTCAAAGAGAAAACGCTGTGAAGTCTGGGGAGAAAACCCCAATCCCAATG ACTGGAGGAGAGTTGATGGTTGGCCAGTTGGGCTGAAAAATGTTGGCAATACAT GTTGGTTTAGTGCTGTTATTCAGTCTCTCTTTCAATTGCCTGAATTTCGAAGACTT GTTCTCAGTTATAGTCTGCCACAAAATGTACTTGAAAATTGTCGAAGTCATACAG AAAAGAGAAATATCATGTTTATGCAAGAGCTTCAGTATTTGTTTGCTCTAATGAT GGGATCAAATAGAAAATTTGTAGACCCGTCTGCAGCCCTGGATCTATTAAAGGG AGCATTCCGATCATCTGAGGAACAGCAGCAAGATGTGAGTGAATTCACACACAA GCTCCTGGATTGGCTAGAGGACGCATTCCAGCTAGCTGTTAATGTTAACAGTCCC AGGAACAAATCTGAAAATCCAATGGTGCAGCTGTTCTATGGTACTTTCCTGACTG AAGGGGTTCGTGAAGGAAAACCCTTTTGTAACAATGAGACCTTCGGCCAGTATC CTCTTCAGGTAAACGGTTATCGCAACTTAGACGAGTGTTTGGAAGGGGCCATGGT GGAGGGTGATGTTGAGCTTCTTCCCTCCGATCACTCGGTGAAGTATGGACAAGA GCGTTGGTTTACAAAGCTACCTCCAGTGTTGACCTTTGAACTCTCAAGATTTGAG TTTAATCAGTCCCTTGGGCAGCCAGAGAAAATTCACAATAAGCTGGAATTTCCTC AGATTATTTATATGGACAGGTACATGTACAGGAGCAAGGAGCTTATTCGAAATA AGAGAGAGTGTATTCGAAAGTTGAAGGAGGAAATAAAAATTCTGCAGCAAAAA TTGGAAAGGTATGTGAAATATGGCTCAGGCCCAGCTCGGTTCCCGCTCCCGGAC ATGCTGAAATATGTTATTGAATTTGCTAGTACAAAACCTGCCTCAGAAAGCTGTC CACCTGAAAGTGACACACATATGACATTACCACTTTCTTCAGTGCACTGCTCGGT TTCTGACCAGACATCCAAGGAAAGTACAAGTACAGAAAGCTCTTCTCAGGATGT TGAAAGTACCTTTTCTTCTCCTGAAGATTCTTTACCCAAGTCTAAACCACTGACAT CTTCTCGGTCTTCCATGGAAATGCCTTCACAGCCAGCTCCACGAACAGTCACAGA TGAGGAGATAAATTTTGTTAAGACCTGTCTTCAGAGATGGAGGAGTGAGATTGA ACAAGATATACAAGATTTAAAGACTTGTATTGCAAGTACTACTCAGACTATTGAA CAGATGTACTGCGATCCTCTCCTTCGTCAGGTGCCTTATCGCTTGCATGCAGTTCT 
TGTTCATGAAGGACAAGCAAATGCTGGACACTATTGGGCCTATATCTATAATCAA CCCCGACAGAGCTGGCTCAAGTACAATGACATCTCTGTTACTGAATCTTCCTGGG AAGAAGTTGAAAGAGATTCCTATGGAGGCCTGAGAAATGTTAGTGCTTACTGTC TGATGTACATTAATGACAAACTACCCTACTTCAATGCAGAGGCAGCCCCAACTG AATCAGATCAAATGTCAGAAGTGGAAGCCCTATCTGTGGAACTCAAGCATTACA TTCAGGAGGATAACTGGCGGTTTGAGCAGGAAGTAGAGGAGTGGGAAGAAGAG CAGTCTTGCAAAATCCCTCAAATGGAGTCCTCCACCAACTCCTCATCACAGGACT ACTCTACATCACAAGAGCCTTCAGTAGCCTCTTCTCATGGGGTTCGCTGCTTGTC GTCTGAGCATGCTGTGATTGTAAAGGAGCAAACTGCCCAGGCTATTGCAAACAC AGCCCGTGCCTATGAGAAGAGCGGTGTAGAAGCGGCACTGAGTGAGGTTAAAGA AGCTGAACCCAAGAAGCCCATGCCCCAGGAAACAAACCTTGCAGAGCAGTCAG AACAGCCCCCAAAGGCTAATGATGCAGAGTCTACTGCCCAGCCTAATTCTGAGG TCTCTGAAGTCGAGATTCCCAGTGTGGGAAGGATTCTGGTTAGATCTGATGCAGA TGGATATGATGAGGAGGTGATGCTGAGCCCTGCCATGCAAGGGGTCATCCTGGC CATAGCTAAAGCCCGTCAGACCTTTGACCGAGATGGGTCTGAAGCAGGGCTGAT TAAGGCATTCCATGAAGAATACTCCAGGCTCTATCAGCTTGCCAAAGAGACCCC CACCTCTCACAGTGATCCTCGACTTCAGCATGTCCTTGTCTACTTTTTCCAAAATG AAGCACCCAAAAGGGTAGTAGAACGAACCCTTCTGGAACAGTTTGCAGATAAAA ATCTTAGCTATGATGAAAGATCAATCAGCATTATGAAGGTGGCTCAAGCGAAAC TGAAGGAAATTGGTCCAGATGACATGAATATGGAAGAGTACAAGAAGTGGCATG AAGATTATAGTTTGTTCCGAAAAGTGTCTGTGTATCTCCTAACAGGCCTAGAACT CTATCAAAAAGGAAAGTACCAAGAGGCACTTTCCTACCTGGTATATGCCTACCA GAGCAATGCTGCCCTGCTGATGAAGGGGCCCCGCCGGGGGGTCAAAGAATCCGT GATTGCTTTATACCGAAGAAAATGCCTTCTGGAGCTGAATGCCAAAGCAGCTTCT CTTTTTGAAACAAATGATGATCACTCCGTAACTGAGGGCATTAATGTGATGAATG AACTGATCATCCCCTGCATTCACCTTATCATTAATAATGACATTTCCAAGGATGA TCTGGATGCCATTGAGGTCATGAGAAACCATTGGTGCTCTTACCTTGGGCAAGAT ATTGCAGAAAATCTGCAGCTGTGCCTAGGGGAGTTTCTACCCAGACTTCTAGATC CTTCTGCAGAAATCATCGTCTTGAAAGAGCCTCCAACTATTCGACCCAATTCTCC CTATGACCTATGTAGCCGATTTGCAGCTGTCATGGAGTCAATTCAGGGAGTTTCA ACTGTGACAGTGAAA TAA GCTCCCACATGTTCAAGGCCCATTCTGGTTCCTGGCT GCCTGCCTCTTGCACAGAAGTTCGTTGTCATAGTGCTCACCTTGGGAAAAGGATT AGGTGGGCACATAAGATTCCGATCAGACCCCAACCATGCTGCATGTGTAAAGAA GGATTGAAAATAAAATTGCACTTTTTAGGTACAAAATCATAAAAGCTGTTTCACT AGAAAAGGCAGAAAGCAGTGTATTAAGGTGTTGAATTACGCCAGAAGACCTGAA ATGCCTTGTACCTACAACAATGCTTAGGCTTTTCTAAGCCTCTTGCCACTTTTAAA 
ATTATCCTTCAGGCATAAATATTTTTGACAGCAGAATAGAAGAATGATTCATGAG AACCTGAACCAGATGAACAGCTACTAGTTATTTTATCAAATACAGATGACATTTA AAAATTCTTAACTACAAGAGATTAGAAATATAAACCTTGCCTGGCTCTTGCCAGG AGATAACAAAATGGGTTGCTGATGAACTGCACCCTTTTACATGTGGGTAGAATAT AAGCTCACATGGCAGTGAGATGTTGAAAAGTCAAAAGAGACCTGTCTCTCTCCTT TCTTTTCTATCTTTAAACCAGAAAACCTCATACTCAGTCCTCAGTGAAAGAAAGT AAAGTATTAAGGACTTTAGGCAGAAGAGCATTGTGTAACTTGACTGAAGATCAT CCATTAATAGTTATTAGGCATTTAGGTAAAATTTTCTAATACCTAAAAATTGTCA AAAACAGTCAATAGGGCTACTGCTGGCCCAAAGACCATTTAGGTCCACCTCCTCT TTTTTGCTCTTTTTTTTTTTTCTGTGACAGTTTCACTGTGTCGCCCAGGCTGGCGTT CAGTGGTGCAATCTCAGCTCACTGCAAACTCTGTCTCCTGGGCTCAAGTGATTCT CGTGCCTCAGCCTCCCGAATAGCTGGAATTACGGGCATGCACCACCACACCTGG CTAATTTTTGTATTTTTAATAGAGATGGGGTTTCACCATATTGGCCAGGCTGATCT CTAACTCCTGGCCTCAAGTGATCTATCTGCCTCCCTCAGCCTCCCAAAGTCTGGG ATTGCAGACAAGTCATCGTACCCGGCCTTCTTTTTTGCCCTTAAAAGTAAGGGAT GTGGGTTTGTACAAAAAAAAAAAAAAAAAAAAAAAAAAACCAGCATACATATG CAAAACTATATATATATGTATATGTAGAGAAAAATACTTCCCATTGATCATTTTT AAAAGGCTTCTGATTGGATATTGTGTTTTAACCAAATTTTAAAGATTAATGGAAT CATGAAAGGGAAAAAATTGATACAACTATGCAGATTTTATAAATGTGCAATAAA AGTATTTGTTTTACA.

[5985] The human 48120 sequence (FIG. 35; SEQ ID NO:77) is approximately 4873 nucleotides long. The nucleic acid sequence includes an initiation codon (ATG) and a termination codon (TAA) which are underscored above. The region between and inclusive of the initiation codon and the termination codon is a methionine-initiated coding sequence of about 3420 nucleotides, including the termination codon (nucleotides indicated as “coding” of SEQ ID NO:77; SEQ ID NO:79). The coding sequence encodes a 1139 amino acid protein (SEQ ID NO:78), which is recited as follows: MTAELQQDDAAGAADGHGSSCQMLLNQLREITGIQDPSFLHEALKASNGDITQAVS (SEQ ID NO:78) LLTDERVKEPSQDTVATEPSEVEGSAANKEVLAKVIDLTHDNKDDLQAAIALSLLES PKIQADGRDLNRMHEATSAETKRSKRKRCEVWGENPNPNDWRRVDGWPVGLKNV GNTCWFSAVIQSLFQLPEFRRLVLSYSLPQNVLENCRSHTEKRNIMFMQELQYLFAL MMGSNRKFVDPSAALDLLKGAFRSSEEQQQDVSEFTHKLLDWLEDAFQLAVNVNS PRNKSENPMVQLFYGTFLTEGVREGKPFCNNETFGQYPLQVNGYRNLDECLEGAMV EGDVELLPSDHSVKYGQERWFTKLPPVLTFELSRFEFNQSLGQPEKIHNKLEFPQIIY MDRYMYRSKELIRNKRECIRKLKEEIKILQQKLERYVKYGSGPARFPLPDMLKYVIE FASTKPASESCPPESDTHMTLPLSSVHCSVSDQTSKESTSTESSSQDVESTFSSPEDSLP KSKPLTSSRSSMEMPSQPAPRTVTDEEINFVKTCLQRWRSEIEQDIQDLKTCIASTTQT IEQMYCDPLLRQVPYRLHAVLVHEGQANAGHYWAYIYNQPRQSWLKYNDISVTESS WEEVERDSYGGLRNVSAYCLMYINDKLPYFNAEAAPTESDQMSEVEALSVELKHYI QEDNWRFEQEVEEWEEEQSCKIPQMESSTNSSSQDYSTSQEPSVASSHGVRCLSSEH AVIVKEQTAQAIANTARAYEKSGVEAALSEVKEAEPKKPMPQETNLAEQSEQPPKA NDAESTAQPNSEVSEVEIPSVGRILVRSDADGYDEEVMLSPAMQGVILAIAKARQTF DRDGSEAGLIKAFHEEYSRLYQLAKETPTSHSDPRLQHVLVYFFQNEAPKRVVERTL LEQFADKNLSYDERSISIMKVAQAKLKEIGPDDMNMEEYKKWHEDYSLFRKVSVYL LTGLELYQKGKYQEALSYLVYAYQSNAALLMKGPRRGVKESVIALYRRKCLLELNA KAASLFETNDDHSVTEGINVMNELIIPCIHLIINNDISKDDLDAIEVMRNHWCSYLGQ DIAENLQLCLGEFLPRLLDPSAEIIVLKEPPTIRPNSPYDLCSRFAAVMESIQGVSTVTV K.

[5986] The human 46689 nucleic acid sequence is recited as follows: CACTAGTAACGCCGCCATGTGCTGGAATTCGCCCTTCTCGGGAAGCGCGCCATTG (SEQ ID NO:80) TGTTGGTACCCGGGAATTCGCGGCCGCGTCGACGCCCGCCGGGGCTCTCCAGCTT CGCC ATG CCGCCGTGGGGCGCCGCCCTCGCGCTCATCTTGGCCGTGCTCGCCCTT CTCGGCCTGCTCGGCCCGCGGCTCCGGGGACCCTGGGGGCGCGCCGTCGGAGAG AGGACCCTGCCGGGGGCCCAAGACCGAGACGACGGGGAGGAGGCGGACGGCGG AGGCCCGGCGGACCAGTTCAGCGACGGGCGCGAGCCACTGCCGGGAGGGTGCA GCCTTGTTTGCAAGCCGTCGGCCCTGGCCCAGTGCCTGCTGCGCGCCCTGCGGCG CTCAGAGGCGCTGGAGGCCGGCCCGCGCTCCTGGTTCTCCGGGCCCCACCTGCA GACCCTCTGCCACTTCGTCCTGCCCGTAGCGCCTGGGCCTGAGCTGGCCCGGGAG TACCTGCAGTTGGCGGACGATGGGCTAGTGGCCCTGGACTGGGTGGTAGGACCT TGTGTTCGGGGCCGCCGGATCACCAGCGCCGGGGGCCTTCCTGCGGTGCTTCTGG TGATCCCCAATGCGTGGGGTCGCCTCACCCGCAACGTGCTCGGCCTTTGCTTGCT CGCCCTGGAGCGCGGCTACTACCCGGTCATCTTCCATCGCCGCGGCCACCACGGT TGCCCACTGGTCAGCCCCCGGCTGCAGCCTTTCGGGGACCCGTCCGACCTCAAGG AGGCGGTCACATACATCCGCTTCCGACACCCGGCGGCGCCGCTGTTCGCGGTGA GCGAAGGCTCGGGCTCGGCGCTGCTCCTGTCCTACCTGGGCGAGTGCGGCTCCTC CAGCTACGTGACAGGCGCCGCCTGCATCTCGCCCGTGCTGCGCTGCCGAGAGTG GTTCGAGGCCGGCCTGCCCTGGCCCTACGAGCGGGGCTTTCTGCTCCACCAGAA GATCGCCCTCAGCAGGTATGCCACAGCCCTGGAGGACACTGTGGACACCAGCAG ACTGTTCAGGAGCCGTTCCCTTCGAGAGTTTGAGGAGGCTCTCTTCTGCCACACC AAAAGCTTCCCCATCAGCTGGGATGCCTACTGGGACCGCAACGACCCGCTCCGG GATGTCGATGAGGCAGCCGTGCCTGTGCTGTGTATCTGCAGTGCTGACGACCCCG TGTGTGGACCCCCAGACCACACTCTGACAACTGAACTCTTCCACAGCAACCCCTA CTTCTTCCTCCTGCTCAGTCGCCACGGAGGCCACTGTGGCTTCCTGCGCCAGGAG CCCTTGCCAGCCTGGAGCCATGAGGTCATCTTGGAGTCCTTCCGGGCCTTGACTG AGTTCTTCCGAACGGAGGAGAGGATTAAAGGGCTGAGCAGGCACAGAGCTTCCT TCCTTGGGGGCCGTCGTCGTGGGGGAGCCTTGCAGAGGCGGGAAGTCTCTTCCTC TTCCAACCTGGAGGAGATCTTTAACTGGAAGCGATCATACACAAGG TGA GAGAC CTGGCCTGAGAACCCCCAAGTCCTGCAAAGAAAAACAGAGCTGGGCAAGGGGG AGTCCTGGAAAGATGGGGCGGACTGAACAGAGGGAGCTCCAGCTCTGTGCTCCT CATTCAGTCCCTCTCTCTTAAATTGGTGCCTTGAAAGAGAAGGAACGTCCTGCGA GCCTGCACTCACTTCATCCTCAGCAGAACTCCTGCCTGGCCTCTGCTCAACATAT CCCTACTCATCCGGTCAGCAGCGGCGCGTTCCAGTCACTGTCACCTGTCACTGAC ATCACAAGCCAAAGGATAGCACTTTTTCAATCCATGGACTCAGGAGAAAATGCC 
CTCTTACTGGCAGTGGCTAGAGGGATGAGACGTTTGTGTATGTCACTGGGCAGTG ACCCCGATTCTCAAGCTGGAGCCATTTGATGTCATGAGGACAGGATGTTTGTGTC TCGGCCCCACTTCCCTCATTTGCTCTGTGGTTGTGGCGCCCTGCTTTGACCGAATG CTCTGGCAACTGCGGCAGCAGGCTTGTGTGTGTGAGAAGGGCGGCAGAGGCAGT GGGGCTGGCT.

[5987] The human 46689 sequence (FIG. 37; SEQ ID NO:80) is approximately 2082 nucleotides long. The nucleic acid sequence includes an initiation codon (ATG) and a termination codon (TGA) which are underscored above. The region between and inclusive of the initiation codon and the termination codon is a methionine-initiated coding sequence of about 1407 nucleotides, including the termination codon (nucleotides indicated as “coding” of SEQ ID NO:80; SEQ ID NO:82). The coding sequence encodes a 468 amino acid protein (SEQ ID NO:81), which is recited as follows: MPPWGAALALILAVLALLGLLGPRLRGPWGRAVGERTLPGAQDRDDGEEADGGGP (SEQ ID NO:81) ADQFSDGREPLPGGCSLVCKPSALAQCLLRALRRSEALEAGPRSWFSGPHLQTLCHF VLPVAPGPELAREYLQLADDGLVALDWVVGPCVRGRRITSAGGLPAVLLVIPNAWG RLTRNVLGLCLLALERGYYPVIFHRRGHHGCPLVSPRLQPFGDPSDLKEAVTYIRFRH PAAPLFAVSEGSGSALLLSYLGECGSSSYVTGAACISPVLRCREWFEAGLPWPYERG FLLHQKIALSRYATALEDTVDTSRLFRSRSLREFEEALFCHTKSFPISWDAYWDRNDP LRDVDEAAVPVLCICSADDPVCGPPDHTLTTELFHSNPYFFLLLSRHGGHCGFLRQEP LPAWSHEVILESFRALTEFFRTEERIKGLSRHRASFLGGRRRGGALQRREVSSSSNLEE IFNWKRSYTR.
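The stated coding-sequence lengths for 23479, 48120, and 46689 can be checked against the stated protein lengths with a short sketch. It assumes only that each coding sequence consists of one codon per amino acid plus a single termination codon. The dictionary layout and function name are illustrative.

```python
# Lengths as stated in paragraphs [5983], [5985], and [5987].
records = {
    "23479": {"cds_nt": 2805, "protein_aa": 934},
    "48120": {"cds_nt": 3420, "protein_aa": 1139},
    "46689": {"cds_nt": 1407, "protein_aa": 468},
}


def expected_cds_length(protein_aa):
    """Nucleotides in a coding sequence: 3 per residue plus the stop codon."""
    return 3 * (protein_aa + 1)


consistent = {
    name: expected_cds_length(r["protein_aa"]) == r["cds_nt"]
    for name, r in records.items()
}
for name, ok in consistent.items():
    print(name, "consistent" if ok else "inconsistent")
```

All three stated nucleotide counts match three times the number of residues plus one, which agrees with the statement that each coding sequence includes the termination codon.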

Example 49 Tissue Distribution of 23479, 48120, or 46689 mRNA by TaqMan Analysis

[5988] Endogenous human 23479, 48120, or 46689 gene expression was determined using the Perkin-Elmer/ABI 7700 Sequence Detection System, which employs TaqMan technology. Briefly, TaqMan technology relies on standard RT-PCR with the addition of a third gene-specific oligonucleotide (referred to as a probe) which has a fluorescent dye coupled to its 5′ end (typically 6-FAM) and a quenching dye at the 3′ end (typically TAMRA). When the fluorescently tagged oligonucleotide is intact, the fluorescent signal from the 5′ dye is quenched. As PCR proceeds, the 5′ to 3′ nucleolytic activity of Taq polymerase digests the labeled probe, producing a free nucleotide labeled with 6-FAM, which is now detected as a fluorescent signal. The PCR cycle at which fluorescence is first released and detected is inversely related to the logarithm of the starting amount of the gene of interest in the test sample, thus providing a quantitative measure of the initial template concentration. Samples can be internally controlled by the addition of a second set of primers/probe specific for a housekeeping gene such as GAPDH which has been labeled with a different fluorophore on the 5′ end (typically VIC).

[5989] To determine the level of 23479 and 48120 in various human tissues, primer/probe sets were designed. Total RNA was prepared from a series of human tissues using an RNeasy kit from Qiagen. First strand cDNA was prepared from 1 μg total RNA using an oligo-dT primer and Superscript II reverse transcriptase (Gibco/BRL). cDNA obtained from approximately 50 ng total RNA was used per TaqMan reaction. Tissues tested include the human tissues and several cell lines shown in Tables 19 and 20. 23479 mRNA was detected in coronary SMC, HUVEC, normal brain cortex, brain hypothalamus, lung tumor, and erythroid cells (Table 19). 48120 expression was found in heart, heart CHF, normal brain cortex, and lung tumor (Table 20).

[5990] To determine the level of 46689 in various human tissues a primer/probe set was designed. Total RNA was prepared from a series of human tissues using an RNeasy kit from Qiagen. First strand cDNA was prepared from 1 μg total RNA using an oligo-dT primer and Superscript II reverse transcriptase (Gibco/BRL). cDNA obtained from approximately 50 ng total RNA was used per TaqMan reaction. Tissues tested include the human tissues and several cell lines shown in Tables 18-24. 46689 mRNA was detected in hematopoietic cells (e.g., bone-marrow mononuclear cells), lung, thymus, glial cells, kidney, and the brain (e.g., the cortex), as well as in numerous tumors from the lung, brain, breast, and ovary.

TABLE 19
Expression of 23479 mRNA

Tissue                                            Relative Expression
Artery normal                                     136.3135
Aorta diseased                                    23.0355
Vein normal                                       6.7776
Coronary Smooth Muscle Cells (SMC)                219.1514
Human Umbilical Vein Endothelial Cells (HUVEC)    829.3195
Hemangioma                                        36.3979
Heart normal                                      102.9489
Heart Congestive Heart Failure                    89.6222
Kidney                                            122.8526
Skeletal Muscle                                   170.755
Adipose normal                                    5.4861
Pancreas                                          37.6815
primary osteoblasts                               44.1942
Osteoclasts (differentiated)                      0
Skin normal                                       4.3796
Spinal cord normal                                37.4212
Brain Cortex normal                               675.9554
Brain Hypothalamus normal                         219.9123
Nerve                                             32.1286
Dorsal Root Ganglion                              54.7879
Breast normal                                     29.2585
Breast tumor                                      69.8304
Ovary normal                                      88.3883
Ovary Tumor                                       1.0761
Prostate Normal                                   6.7308
Prostate Tumor                                    54.7879
Salivary glands                                   3.2848
Colon normal                                      0.7689
Colon Tumor                                       27.3939
Lung normal                                       0.5929
Lung tumor                                        184.2837
Lung Chronic Obstructive Pulmonary Disease        1.3433
Colon Inflammatory Bowel Disease                  0.2358
Liver normal                                      2.015
Liver fibrosis                                    11.2028
Spleen normal                                     2.7147
Tonsil normal                                     14.18
Lymph node normal                                 10.0616
Small intestine normal                            3.0121
Skin-Decubitus                                    8.6086
Synovium                                          0.7026
Bone Marrow, Mononuclear Cells                    6.9441
Activated peripheral blood mononuclear cells      2.7241
Neutrophils                                       16.9215
Megakaryocytes                                    89.9333
Erythroid                                         424.8425

[5991] Expression of human 23479 mRNA was detected in many tissues. Prominent expression was detected in cardiovascular tissues (e.g., arteries, smooth muscle cells, endothelial cells, and the heart), kidney, skeletal muscle, brain (e.g., the cortex and hypothalamus), ovary, and blood cells (e.g., megakaryocytes and erythroid cells). Significantly, expression of human 23479 was elevated in lung, breast, prostate, and colon tumors, as compared to its expression in the appropriate normal tissues. In contrast, expression of human 23479 was decreased in ovary tumors as compared to its expression in normal ovary tissue.

TABLE 20
Expression of 48120 mRNA

Tissue                                            Relative Expression
Artery normal                                     5.9826
Aorta diseased                                    0.6858
Vein normal                                       0.398
Coronary Smooth Muscle Cells                      8.6986
Human Umbilical Vein Endothelial Cells            13.9848
Hemangioma                                        2.3388
Heart normal                                      34.5541
Heart congestive heart failure                    104.0248
Kidney                                            3.9334
Adipose normal                                    0
Pancreas                                          3.3422
primary osteoblasts                               0.2025
Osteoclasts (differentiated)                      0
Skin normal                                       0
Spinal cord normal                                0.4832
Brain Cortex normal                               39.83
Brain Hypothalamus normal                         6.8485
Nerve                                             1.8542
Dorsal Root Ganglion                              1.348
Breast normal                                     1.6142
Breast tumor                                      1.249
Ovary normal                                      5.962
Ovary Tumor                                       0.4832
Prostate Normal                                   1.9531
Prostate Tumor                                    4.4716
Salivary glands                                   0.231
Colon normal                                      0.3133
Colon Tumor                                       5.2082
Lung normal                                       0.1567
Lung tumor                                        49.8936
Lung Chronic Obstructive Pulmonary Disease        0.2247
Colon Inflammatory Bowel Disease                  0.06
Liver normal                                      0.3898
Liver fibrosis                                    1.334
Spleen normal                                     1.5592
Tonsil normal                                     1.8931
Lymph node normal                                 1.6367
Small intestine normal                            0.4325
Macrophages                                       0.0183
Synovium                                          0
Bone Marrow, Mononuclear Cells                    0.0363
Activated peripheral blood mononuclear cells      0.0754
Neutrophils                                       0.5628
Megakaryocytes                                    0.0407
Erythroid                                         0.5325
positive control                                  59.1286
Skeletal Muscle                                   21.8685

[5992] Expression of human 48120 mRNA was detected in many tissues, including cardiovascular tissues (e.g., arteries, smooth muscle cells, endothelial cells, and the heart), kidney, pancreas, skeletal muscle, brain (e.g., the cortex and hypothalamus), breast, ovary, prostate, spleen, tonsil, and lymph node tissue. Significantly, expression of human 48120 was highly elevated in heart tissue from a patient that suffered from congestive heart failure. In addition, expression of human 48120 was elevated in lung, colon and prostate tumors, as compared to its expression in the appropriate normal tissues. In contrast, expression of human 48120 was decreased in ovary tumors as compared to its expression in normal ovary tissue. TABLE 21 Expression of 46689 mRNA Relative Tissue Expression Artery normal 0 Vein normal 0 Aortic SMC EARLY 6.23 Aortic SMC LATE 5.24 Static HUVEC 7.13 Shear HUVEC 7.88 Heart normal 0.94 Heart CHF 1.33 Kidney 10.4 Skeletal Muscle 1.19 Adipose normal 7.46 Pancreas 4.67 Primary Osteoblasts 2 Osteoclasts (diff) 0.16 Skin normal 2.69 Spinal cord normal 1.18 Brain Cortex normal 8.07 Brain Hypothalamus normal 3.97 Nerve 3.44 DRG (Dorsal Root Ganglion) 3.31 Glial Cells (Astrocytes) 12.8 Glioblastoma 2.65 Breast normal 6.51 Breast tumor 104.57 Ovary normal 0.23 Ovary Tumor 8.84 Prostate Normal 5.04 Prostate Tumor 0.47 Epithelial Cells (Prostate) 0.36 Colon normal 0.09 Colon Tumor 2.83 Lung normal 40.18 Lung tumor 16.89 Lung COPD 81.76 Colon IBD 2.45 Liver normal 2.26 Liver fibrosis 4.01 Dermal Cells- fibroblasts 6.25 Spleen normal 6.4 Tonsil normal 3.68 Lymph node 0.03 Thymus normal 22.6 Skin-Decubitus 9.57 Synovium (rheumatoid Arthritis) 49.63 BM-MNC 47.94 Activated PBMC 0.63

[5993] Expression of human 46689 mRNA was detected in bone marrow mononuclear cells (BM-MNC), lung, thymus, glial cells, kidney, brain cortex, human umbilical vein endothelial cells (HUVECs), aortic smooth muscle cells (SMCs), heart, skeletal muscle, adipose tissue, pancreas, osteoblasts and osteoclasts, skin, spinal cord, hypothalamus, dorsal root ganglia (DRG) and other nerve cells, breast, ovary, prostate, prostate epithelial cells, colon, liver, dermal fibroblasts, spleen, tonsil, lymph nodes, and activated peripheral blood mononuclear cells (PBMC). In addition, human 46689 is expressed in tissues, cells, or fluids associated with disease, including a breast tumor, lung tissue from a patient with chronic obstructive pulmonary disease (COPD), synovium from a patient with rheumatoid arthritis, a lung tumor, decubitus skin, an ovary tumor, liver tissue from a patient with liver fibrosis, a colon tumor, colon tissue from a patient with inflammatory bowel disease (IBD), a prostate tumor, and heart tissue from a patient that had congestive heart failure. Importantly, 46689 expression was elevated in the breast, ovary, and colon tumors as compared to the appropriate normal tissue.

TABLE 22 Expression of 46689 mRNA
Tissue    Relative Expression
Hemangioma    0.0
Hemangioma    0.9
Hemangioma    0.5
Normal Kidney    4.8
Renal Cell Carcinoma    0.3
Wilms Tumor    6.4
Wilms Tumor    4.2
Skin    0.0
Uterine Adenocarcinoma    1.2
Neuroblastoma    1.2
Fetal Adrenal    0.4
Fetal Kidney    4.8
Fetal Heart    1.4
Normal Heart    2.1
Cartilage    1.9
Spinal cord    4.2
Lymphangioma    2.8
Endometrial polyps    0.0
Synovium (RA)    0.4
Hyperkeratotic skin    1.8

[5994] Expression of 46689 was detected in normal kidney, fetal kidney, fetal adrenal gland, normal heart, fetal heart, cartilage, and spinal cord tissues. In addition, expression of 46689 was observed in several samples associated with disease, including Wilms tumors, a lymphangioma, hyperkeratotic skin, a uterine adenocarcinoma, a neuroblastoma, hemangiomas, synovium from a patient with rheumatoid arthritis (RA), and a renal cell carcinoma.

TABLE 23 Expression of 46689 mRNA
Tissue    Relative Expression
PIT 400 Breast N    0.49
PIT 56 Breast N    7.34
MDA 106 Breast T    1.17
MDA 234 Breast T    0.46
NDR 57 Breast T    1.02
MDA 304 Breast T    0.85
NDR 58 Breast T    2.84
NDR 132 Breast T    6.99
NDR 07 Breast T    0.33
NDR 12 Breast T    13.00
PIT 208 Ovary N    1.81
CHT 620 Ovary N    1.05
CHT 619 Ovary N    1.79
CLN 03 Ovary T    1.24
CLN 05 Ovary T    3.57
CLN 17 Ovary T    2.78
CLN 07 Ovary T    0.43
CLN 08 Ovary T    0.30
MDA 216 Ovary T    0.52
CLN 012 Ovary T    4.63
MDA 25 Ovary T    5.86
MDA 183 Lung N    0.08
CLN 930 Lung N    0.59
MDA 185 Lung N    0.51
CHT 816 Lung N    0.21
MPI 215 Lung T-SmC    3.51
MDA 259 Lung T-PDNSCCL    2.14
CHT 832 Lung T-PDNSCCL    5.05
MDA 253 Lung T-PDNSCCL    3.96
CHT 814 Lung T-SCC    5.90
CHT 911 Lung T-SCC    16.01
CHT 726 Lung T-SCC    0.91
CHT 845 Lung T-AC    11.01

[5995] Expression of human 46689 was detected in normal breast, ovary, and lung tissue samples, as well as in breast, ovary, and lung tumors. Significantly, expression of human 46689 mRNA was elevated in 8/8 lung tumors analyzed, 4/7 ovary tumors analyzed, and 1/8 breast tumors analyzed, as compared to normal lung, ovary, and breast tissue samples, respectively. This indicates that human 46689 is a marker of tumor formation and/or growth, especially in the lung. Abbreviation used: N, normal; T, tumor; SmC, small cell carcinoma; PDNSCCL, poorly differentiated non-small cell carcinoma of the lung; SCC, squamous cell carcinoma; AC, adenocarcinoma. TABLE 24 Expression of 46689 mRNA Relative Tissue Expression Colon N 0.33 Colon N 4.51 Colon N 2.66 Colon N 23.63 Colon T 0.30 Colon T 0.52 Colon T 6.18 Colon T 1.88 Colon T 0.20 Liver Met 0.20 Liver Met 0.14 Liver Met 0.32 Liver Met 0.18 Liver Nor 0.10 Liver Nor 0.27 Brain N 0.04 Brain N 0.05 Brain N 0.11 Brain N 0.05 Astrocytes 0.74 Brain T 1.85 Brain T 0.91 Brain T 0.35 Brain T 0.30 HMVEC-Arrested 0.13 HMVEC-Proliferating 0.17 Placenta 0.49 Fetal Adrenal 0.14 Fetal Adrenal 0.00 Fetal Liver 0.11 Fetal Liver 2.85 Wilms T 0.07 Renal T 0.00 Endometrial AC 0.13

[5996] Expression of 46689 mRNA was detected in normal colon, liver, and brain tissue, as well as in human microvascular endothelial cells (HMVEC), fetal adrenal gland, and fetal liver. Expression of 46689 was also detected in colon tumors, liver metastases, brain tumors, a Wilms tumor, a renal tumor, and an endometrial adenocarcinoma (AC). Significantly, 4/4 brain tumors analyzed displayed elevated 46689 mRNA expression as compared to normal brain tissue, again suggesting that human 46689 is a marker for tumor formation and/or growth in some tissues. Abbreviations used: N, normal; T, tumor; Met, metastasis.

TABLE 25 Expression of 46689 mRNA
Tissue    Relative Expression
PIT 337 Colon N    13.09
CHT 410 Colon N    4.32
CHT 425 Colon N    4.46
CHT 371 Colon N    4.02
PIT 281 Colon N    7.21
NDR 211 Colon N    4.50
CHT 122 Adenomas    6.50
CHT 887 Adenomas    21.64
CHT 414 Colonic ACA-B    9.29
CHT 841 Colonic ACA-B    8.00
CHT 890 Colonic ACA-B    3.09
CHT 910 Colonic ACA-B    3.34
CHT 807 Colonic ACA-B    3.75
CHT 382 Colonic ACA-B    3.92
CHT 377 Colonic ACA-B    3.04
CHT 520 Colonic ACA-C    4.58
CHT 596 Colonic ACA-C    4.04
CHT 907 Colonic ACA-C    6.97
CHT 372 Colonic ACA-C    11.05
NDR 210 Colonic ACA-C    9.26
CHT 1365 Colonic ACA-C    6.75
CLN 740 Liver N    26.83
CLN 741 Liver N    26.64
NDR 165 Liver N    6.11
NDR 150 Liver N    20.05
PIT 236 Liver N    5.68
CHT 1878 Liver N    21.27
CHT 077 Colon to Liver Met    7.73
CHT 119 Colon to Liver Met    21.64
CHT 131 Colon to Liver Met    18.07
CHT 218 Colon to Liver Met    14.73
CHT 739 Colon to Liver Met    14.63
CHT 755 Colon to Liver Met    18.52
CHT 215 Col Abdominal Met    2.57

[5997] Expression of 46689 mRNA was detected in both normal colon and liver tissues, as well as in colon tumors and liver metastases. Of the colon tumors, 1/15 displayed an increase in 46689 expression relative to normal colon tissue. Abbreviations used: N, normal; ACA, adenocarcinoma; Met, metastasis. TABLE 26 Expression of 46689 mRNA Relative Cell line Expression MCF-7 Breast T 53.47 ZR75 Breast T 44.66 T47D Breast T 49.38 MDA 231 Breast T 19.71 MDA 435 Breast T 7.81 SKBr3 Breast 19.78 DLD 1 ColonT (stageC) 113.83 HCT116 Colon T 31.58 HT29 Colon T 42.25 Colo 205 Colon T 15.68 NCIH125 Lung T 8.34 A549 Lung T 55.36 NHBE Lung 33.49 SKOV-3 ovary 94.08 OVCAR-3 ovary 10.82 293 Baby Kidney 6.71 293T Baby Kidney 19.92

[5998] Table 26 shows expression of 46689 mRNA in cell lines that are suitable for transplantation into mice as part of a xenografting experiment. All cell lines displayed high expression of human 46689, with the exception of MDA 435 Breast tumor cells, NCIH125 Lung tumor cells, and 293 Baby Kidney cells. Expression of 46689 mRNA is particularly high in stage C DLD 1 Colon tumor cells and SKOV-3 Ovary cells.

TABLE 27 Expression of 46689 mRNA
Tissue    Relative Expression
RIP Angio    9.0366
RIP Tumor    13.9848
Xeno Parent 1    3.6955
Xeno Parent 2    0.0566
Xeno VEGF 1    0.1763
Xeno VEGF 2    0.251
Spleen    3.0754
Heart    0.4078
Kidney    0.2536
Liver    3.3422
VEGF 1    0.862
VEGF 2    1.3294
P1    0.4431
P2    0.2853

[5999] Cells expressing human 46689 were transplanted into mice and allowed to form tumors in vivo. The tumor cells were then isolated and analyzed for 46689 expression. RIP Angio shows 46689 expression in cells that were removed just after angiogenesis had begun to provide blood vessels to the tumor, while RIP Tumor shows 46689 expression in cells that were removed after angiogenesis had contributed to further tumor growth. Angiogenesis-dependent tumor growth was correlated with an increase in 46689 expression. The Xeno VEGF1, Xeno VEGF2, VEGF1, and VEGF2 values correspond to 46689 expression levels for cells transplanted into mice that overexpress either VEGF1 or VEGF2, growth factors involved in angiogenesis. The Xeno Parent 1, Xeno Parent 2, P1, and P2 values correspond to control mice that do not overexpress either VEGF1 or VEGF2.

Example 50 Tissue Distribution of 23479, 48120, or 46689 mRNA by Northern Analysis

[6000] Northern blot hybridizations with various RNA samples can be performed under standard conditions and washed under stringent conditions, i.e., 0.2×SSC at 65° C. A DNA probe corresponding to all or a portion of the 23479, 48120, or 46689 cDNA (SEQ ID NO:74) can be used. The DNA was radioactively labeled with ³²P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) can be probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to manufacturer's recommendations.

Example 51 Recombinant Expression of 23479, 48120, or 46689 in Bacterial Cells

[6001] In this example, 23479, 48120, or 46689 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 23479, 48120, or 46689 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-23479, 48120, or 46689 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB199 strain by affinity chromatography on glutathione beads. Using polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates, the molecular weight of the resultant fusion polypeptide is determined.

Example 52 Expression of Recombinant 23479, 48120, or 46689 Protein in COS Cells

[6002] To express the 23479, 48120, or 46689 gene in COS cells (e.g., COS-7 cells, CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182), the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 23479, 48120, or 46689 protein and an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[6003] To construct the plasmid, the 23479, 48120, or 46689 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 23479, 48120, or 46689 coding sequence starting from the initiation codon; the 3′ primer contains sequences complementary to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag, and the last 20 nucleotides of the 23479, 48120, or 46689 coding sequence. The PCR amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably the two restriction sites chosen are different so that the 23479, 48120, or 46689 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.

[6004] COS cells are subsequently transfected with the 23479, 48120, or 46689-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. The expression of the 23479, 48120, or 46689 polypeptide is detected by radiolabelling (³⁵S-methionine or ³⁵S-cysteine available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) using an HA specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with ³⁵S-methionine (or ³⁵S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer: 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

[6005] Alternatively, DNA containing the 23479, 48120, or 46689 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 23479, 48120, or 46689 polypeptide is detected by radiolabelling and immunoprecipitation using a 23479, 48120, or 46689 specific monoclonal antibody.

Examples for 80091

Example 53 Identification and Characterization of Human 80091 cDNA

[6006] ATG GTGGCTGATGCCTGTAATCCCAACAGTTTGGGAGACTGGGGAGGAAGATCA (SEQ ID NO:94) TTTGAGGCCAGGAGTTTGAGACCAGCCTGGGCTCAAGCAGTCCTGCCTCAGCCTC CCAAAGTGCTGGGATTACAGATGGGTCATCTTACTCTGGAAGACTATCAGATCTG GAGTGTGAAAAATGTTCTTGCCAATGAGTTTTTGAACCTCCTTTTCCAGGTGTGT CACATAGTTCTGGGGTTAAGACCAGCTACTCCGGAAGAAGAAGGACAAATTATT AGAGGATGGTTAGAACGAGAGAGCAGGTATGGTCTGCAAGCAGGACACAACTG GTTTATCATCTCCATGCAGTGGTGGCAACAGTGGAAAGAATATGTCAAATACGA TGCCAACCCTGTGGTAATTGAGCCATCATCTGTTTTGAATGGAGGAAAATACTCA TTTGGAACTGCAGCCCATCCTATGGAGCAGGTCGAAGATAGAATTGGAAGCAGC CTCAGTTACGTGAATACTACAGAAGAGAAATTTTCAGACAACATTTCTACTGCAT CTGAAGCCTCAGAAACTGCTGGCAGCGGCTTTCTGTATTCTGCCACACCAGGGGC AGATGTTTGCTTTGCTCGACAACATAACACTTCTGACAATAACAACCAGTGTTTG CTGGGAGCCAATGGGAATATTTTGTTGCACCTTAACCCTCAGAAACCAGGGGCT ATTGATAATCAGCCATTAGTAACTCAAGAACCAGTAAAGGCTACATCATTAACA CTAGAAGGAGGACGATTAAAACGAACTCCACAGCTGATTCATGGAAGAGACTAT GAAATGGTCCCAGAACCTGTGTGGAGAGCACTTTATCACTGGTATGGAGCAAAC CTGGCCTTACCTAGACCAGTTATCAAGAACAGCAAGACAGACATCCCAGAGCTG GAATTATTTCCCCGCTATCTTCTCTTCCTGAGACAGCAGCCTGCCACTCGGACAC AGCAGTCTAACATCTGGGTGAATATGGGAAATGTACCTTCTCCGAATGCACCTTT AAAGCGGGTATTAGCCTATACAGGCTGTTTTAGTCGAATGCAGACCATCAAGGA AATTCACGAATATCTATCTCAAAGACTGCGCATTAAAGAGGAAGATATGCGCCT GTGGCTATACAACAGTGAGAACTACCTTACTCTTCTGGATGATGAGGATCATAAA TTGGAATATTTGAAAATCCAGGATGAACAACACCTGGTAATTGAAGTTCGCAAC AAAGATATGAGTTGGCCTGAGGAGATGTCTTTTATAGCAAATAGTAGTAAAATA GATAGACACAAGGTTCCCACAGAAAAGGGAGCCACAGGTCTAAGCAATCTGGG AAACACATGCTTCATGAACTCAAGCATCCAGTGTGTTAGTAACACACAGCCACT GACACAGTATTTTATCTCAGGGAGACATCTTTATGAACTCAACAGGACAAATCCC ATTGGTATGAAGGGGCATATGGCTAAATGCTATGGTGATTTAGTGCAGGAACTTT GGAGTGGAACTCAGAAGAATGTTGCCCCATTAAAGCTTCGGTGGACCATAGCAA AATATGCTCCCAGGTTTAATGGGTTTCAGCAACAGGACTCCCAAGAACTTCTGGC TTTTCTCTTGGATGGTCTTCATGAAGATCTTAATCGAGTCCATGAAAAGCCATAT GTGGAACTGAAGGACAGTGATGGGCGACCAGACTGGGAAGTAGCTGCAGAGGC CTGGGACAACCATCTAAGAAGAAATAGATCAATTGTTGTGGATTTGTTCCATGGG CAGCTAAGATCTCAAGTAAAATGCAAGACATGTGGGCATATAAGTGTCCGATTT GACCCTTTCAATTTTTTGTCTTTGCCACTACCAATGGACAGTTATATGCACTTAGA 
AATAACAGTGATTAAGTTAGATGGTACTACCCCTGTACGGTATGGACTAAGACT GAATATGGATGAAAAGTACACAGGTTTAAAAAAACAGCTGAGTGATCTCTGTGG ACTTAATTCAGAACAAATCCTTCTAGCAGAAGTACATGGTTCCAACATAAAGAA CTTTCCTCAGGACAACCAAAAAGTACGACTCTCAGTGAGTGGATTTTTGTGTGCA TTTGAAATTCCTGTCCCTGTGTCTCCAATTTCAGCTTCTAGTCCAACACAGACAG ATTTCTCCTCTTCGCCATCTACAAATGAAATGTTCACCCTAACTACCAATGGGGA CCTACCCCGACCAATATTCATCCCCAATGGAATGCCAAACACTGTTGTGCCATGT GGAACTGAGAAGAACTTCACAAATGGAATGGTTAATGGTCACATGCCATCTCTT CCTGACAGCCCCTTTACAGGTTACATCATTGCAGTCCACCGAAAAATGATGAGG ACAGAACTGTATTTCCTGTCATCTCAGAAGAATCGCCCCAGCCTCTTTGGAATGC CATTGATTGTTCCATGTACTGTGCATACCCGGAAGAAAGACCTATATGATGCGGT TTGGATTCAAGTATCCCGGTTAGCGAGCCCACTCCCACCTCAGGAAGCTAGTAAT CATGCCCAGGATTGTGACGACAGTATGGGCTATCAATATCCATTCACTCTACGAG TTGTGCAGAAAGATGGGAACTCCTGTGCTTGGTGCCCATGGTATAGATTTTGCAG AGGCTGTAAAATTGATTGTGGGGAAGACAGAGCTTTCATTGGAAATGCCTATAT CGCTGTGGATTGGGATCCCACAGCCCTTCACCTTCGCTATCAAACATCCCAGGAA AGGGTTGTAGATGAGCATGAGAGTGTGGAGCAGAGTCGGCGAGCGCAAGCCGA GCCCATCAACCTGGACAGCTGTCTCCGTGCTTTCACCAGTGAGGAAGAGCTAGG GGAAAATGAGATGTACTACTGTTCCAAGTGTAAGACCCACTGCTTAGCAACAAA GAAGCTGGATCTCTGGAGGCTTCCACCCATCCTGATTATTCACCTTAAGCGATTT CAATTTGTAAATGGTCGGTGGATAAAATCACAGAAAATTGTCAAATTTCCTCGGG AAAGTTTTGATCCAAGTGCTTTTTTGGTACCAAGAGACCCGGCTCTCTGCCAGCA TAAACCACTCACACCCCAGGGGGATGAGCTCTCTGAGCCCAGGATTCTGGCAAG GGAGGTGAAGAAAGTGGATGCGCAGAGTTCGGCTGGGGAAGAGGACGTGCTCC TGAGCAAAAGCCCATCCTCACTCAGCGCTAACATCATCAGCAGCCCGAAAGGTT CTCCTTCTTCATCAAGAAAAAGTGGAACCAGCTGTCCCTCCAGCAAAAACAGCA GCCCTAATAGCAGCCCACGGACTTTGGGGAGGAGCAAAGGGAGGCTCCGGCTGC CCCAGATTGGCAGCAAAAATAAACTGTCAAGTAGTAAAGAGAACTTGGATGCCA GCAAAGAAAATGGGGCTGGGCAGATATGTGAGCTGGCTGACGCCTTGAGTCGAG GGCATGTGCTGGGGGGCAGCCAACCAGAGTTGGTCACTCCTCAGGACCATGAGG TAGCTTTGGCCAATGGATTCCTTTATGAGCATGAAGCATGTGGCAATGGCTACAG CAATGGTCAGCTTGGAAACCACAGTGAAGAAGACAGCACTGATGACCAAAGAG AAGATACTCGTATTAAGCCTATTTATAATCTATATGCAATTTCGTGCCATTCAGG AATTCTGGGTGGGGGCCATTACGTCACTTATGCCAAAAACCCAAACTGCAAGTG GTACTGTTACAATGACAGCAGCTGTAAGGAACTTCACCCGGATGAAATTGACAC CGACTCTGCCTACATTCTTTTCTATGAGCAGCAGGGGATAGACTATGCACAATTT 
CTGCCAAAGACTGATGGCAAAAAGATGGCAGACACAAGCAGTATGGATGAAGA CTTTGAGTCTGATTACAAAAAGTACTGTGTGTTACAG TAA .

[6007] The human 80091 sequence (SEQ ID NO:94) is approximately 3954 nucleotides long. The nucleic acid sequence includes an initiation codon (ATG) and a termination codon (TAA) which are underscored above. The region between and inclusive of the initiation codon and the termination codon is a methionine-initiated coding sequence, including the termination codon. The coding sequence encodes a 1317 amino acid protein (SEQ ID NO:95), which is recited as follows: MVADACNPNSLGDWGGRSFEARSLRPAWAQAVLPQPPKVLGLQMGHLTLEDYQI (SEQ ID NO:95) WSVKNVLANEFLNLLFQVCHIVLGLRPATPEEEGQIIRGWLERESRYGLQAGHNWFII SMQWWQQWKEYVKYDANPVVIEPSSVLNGGKYSFGTAAHPMEQVEDRIGSSLSYV NTTEEKFSDNISTASEASETAGSGFLYSATPGADVCFARQHNTSDNNNQCLLGANGN ILLHLNPQKPGAIDNQPLVTQEPVKATSLTLEGGRLKRTPQLIHGRDYEMVPEPVWR ALYHWYGANLALPRPVIKNSKTDIPELELFPRYLLFLRQQPATRTQQSNIWVNMGNV PSPNAPLKRVLAYTGCFSRMQTIKEIHEYLSQRLRIKEEDMRLWLYNSENYLTLLDD EDHKLEYLKIQDEQHLVIEVRNKDMSWPEEMSFIANSSKIDRHKVPTEKGATGLSNL GNTCFMNSSIQCVSNTQPLTQYFISGRHLYELNRTNPIGMKGHMAKCYGDLVQELW SGTQKNVAPLKLRWTIAKYAPRFNGFQQQDSQELLAFLLDGLHEDLNRVHEKPYVE LKDSDGRPDWEVAAEAWDNHLRRNRSIVVDLFHGQLRSQVKCKTCGHISVRFDPFN FLSLPLPMDSYMHLEITVIKLDGTTPVRYGLRLNMDEKYTGLKKQLSDLCGLNSEQI LLAEVHGSNIKNFPQDNQKVRLSVSGFLCAFEIPVPVSPISASSPTQTDFSSSPSTNEMF TLTTNGDLPRPIFIPNGMPNTVVPCGTEKNFTNGMVNGHMPSLPDSPFTGYIIAVHRK MMRTELYFLSSQKNRPSLFGMPLIVPCTVHTRKKDLYDAVWIQVSRLASPLPPQEAS NHAQDCDDSMGYQYPFTLRVVQKDGNSCAWCPWYRFCRGCKIDCGEDRAFIGNA YIAVDWDPTALHLRYQTSQERVVDEHESVEQSRRAQAEPINLDSCLRAFTSEEELGE NEMYYCSKCKTHCLATKKLDLWRLPPILIIHLKRFQFVNGRWIKSQKIVKFPRESFDP SAFLVPRDPALCQHKPLTPQGDELSEPRILAREVKKVDAQSSAGEEDVLLSKSPSSLS ANIISSPKGSPSSSRKSGTSCPSSKNSSPNSSPRTLGRSKGRLRLPQIGSKNKLSSSKEN LDASKENGAGQICELADALSRGHVLGGSQPELVTPQDHEVALANGFLYEHEACGNG YSNGQLGNHSEEDSTDDQREDTRIKPIYNLYAISCHSGILGGGHYVTYAKNPNCKWY CYNDSSCKELHPDEIDTDSAYILFYEQQGIDYAQFLPKTDGKKMADTSSMDEDFESD YKKYCVLQ.
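The lengths quoted in this paragraph can be cross-checked with simple codon arithmetic. The following is a minimal sketch in Python that uses only the figures stated above (1317 amino acids, 3954 nucleotides, one termination codon); the function name and constants are illustrative and are not part of the disclosure.

```python
# Consistency check between the stated protein length and the stated
# nucleotide length of the 80091 coding sequence (SEQ ID NO:94 / NO:95).
# All numbers are taken from the paragraph above; names are illustrative.

PROTEIN_LENGTH_AA = 1317   # amino acids in SEQ ID NO:95
SEQUENCE_LENGTH_NT = 3954  # nucleotides in SEQ ID NO:94, ATG through TAA
NT_PER_CODON = 3


def coding_length_nt(protein_length_aa: int, stop_codons: int = 1) -> int:
    """Nucleotides needed to encode the protein plus its termination codon."""
    return (protein_length_aa + stop_codons) * NT_PER_CODON


expected = coding_length_nt(PROTEIN_LENGTH_AA)
print(expected)                        # 3954
print(expected == SEQUENCE_LENGTH_NT)  # True
```

Because the recited sequence begins with the initiation codon and ends with the termination codon, the full 3954 nucleotides correspond to 1318 codons: 1317 sense codons followed by TAA.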

Example 54 Tissue Distribution of 80091 mRNA by TaqMan Analysis

[6008] Endogenous human 80091 gene expression was determined using the Perkin-Elmer/ABI 7700 Sequence Detection System which employs TaqMan technology. Briefly, TaqMan technology relies on standard RT-PCR with the addition of a third gene-specific oligonucleotide (referred to as a probe) which has a fluorescent dye coupled to its 5′ end (typically 6-FAM) and a quenching dye at the 3′ end (typically TAMRA). When the fluorescently tagged oligonucleotide is intact, the fluorescent signal from the 5′ dye is quenched. As PCR proceeds, the 5′ to 3′ nucleolytic activity of Taq polymerase digests the labeled probe, releasing a nucleotide labeled with 6-FAM, which is then detected as a fluorescent signal. The PCR cycle at which fluorescence is first released and detected is inversely related to the starting amount of the gene of interest in the test sample (a larger starting amount is detected at an earlier cycle), thus providing a quantitative measure of the initial template concentration. Samples can be internally controlled by the addition of a second set of primers/probe specific for a housekeeping gene such as GAPDH which has been labeled with a different fluorophore on the 5′ end (typically VIC).
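The specification does not state how the relative expression values reported in the tables were derived from the detection cycle. One common approach, shown here only as an illustrative sketch, normalizes the threshold cycle (Ct) of the target gene to that of the housekeeping gene and assumes that the amount of product doubles each cycle. The function name, the default efficiency of 2.0, and the Ct values are hypothetical and are not taken from the disclosure.

```python
# Sketch of a delta-Ct calculation for a TaqMan assay.
# Assumptions (not from the specification): perfect doubling per cycle,
# and hypothetical Ct values for the target and the GAPDH control.

def relative_expression(ct_target: float, ct_reference: float,
                        efficiency: float = 2.0) -> float:
    """Expression of the target relative to the reference gene.

    A lower Ct means fluorescence was detected earlier, i.e. more
    starting template, so the result grows as ct_target shrinks.
    """
    delta_ct = ct_target - ct_reference
    return efficiency ** (-delta_ct)


# Hypothetical sample: the target is detected three cycles after GAPDH.
print(relative_expression(ct_target=25.0, ct_reference=22.0))  # 0.125

# Detecting the target one cycle earlier doubles the estimate.
print(relative_expression(ct_target=24.0, ct_reference=22.0))  # 0.25
```

Under these assumptions each one-cycle difference corresponds to a two-fold difference in starting template, which is why the detection cycle can serve as a quantitative readout.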

[6009] To determine the level of 80091 in various human tissues a primer/probe set was designed. Total RNA was prepared from a series of human tissues using an RNeasy kit from Qiagen. First strand cDNA was prepared from 1 μg total RNA using an oligo-dT primer and Superscript II reverse transcriptase (Gibco/BRL). cDNA obtained from approximately 50 ng total RNA was used per TaqMan reaction. Tissues tested include the human tissues and several cell lines shown in Table 28. The highest levels of 80091 mRNA expression were observed in erythroid cells, followed by brain cortex and HUVEC.

TABLE 28 Expression of 80091
Tissue Type    Relative Expression
Artery normal    38.6068
Aorta diseased    10.5253
Vein normal    4.1433
Coronary SMC    48.5293
HUVEC    122.4275
Hemangioma    8.5492
Heart normal    31.4674
Heart CHF    32.5771
Kidney    35.4026
Skeletal Muscle    60.1622
Adipose normal    1.5646
Pancreas    6.2584
Primary osteoblasts    27.9695
Osteoclasts (diff)    0.2302
Skin normal    11.4382
Spinal cord normal    29.5643
Brain Cortex normal    219.9123
Brain Hypothalamus normal    59.1286
Nerve    30.3955
DRG (Dorsal Root Ganglion)    60.7909
Breast normal    11.5978
Breast tumor    10.8587
Ovary normal    11.9239
Ovary Tumor    2.2436
Prostate Normal    0.9017
Prostate Tumor    10.8587
Salivary glands    0.4556
Colon normal    0.509
Colon Tumor    7.7049
Lung normal    1.1374
Lung tumor    22.5614
Lung COPD    0.98
Colon IBD    0.1187
Liver normal    6.6843
Liver fibrosis    17.4576
Spleen normal    1.6595
Tonsil normal    1.0358
Lymph node normal    2.5154
Small intestine normal    1.0987
Skin-Decubitus    5.5435
Synovium    0.8298
BM-MNC    21.6423
Activated PBMC    1.8414
Neutrophils    10.8212
Megakaryocytes    38.0753
Erythroid    461.6912

Example 55 Tissue Distribution of 80091 mRNA by Northern Analysis

[6010] Northern blot hybridizations with various RNA samples can be performed under standard conditions and washed under stringent conditions, i.e., 0.2×SSC at 65° C. A DNA probe corresponding to all or a portion of the 80091 cDNA (SEQ ID NO:94) can be used. The DNA was radioactively labeled with ³²P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) can be probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to manufacturer's recommendations.

Example 56 Recombinant Expression of 80091 in Bacterial Cells

[6011] In this example, 80091 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 80091 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-80091 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB199 strain by affinity chromatography on glutathione beads. Using polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates, the molecular weight of the resultant fusion polypeptide is determined.

Example 57 Expression of Recombinant 80091 Protein in COS Cells

[6012] To express the 80091 gene in COS cells (e.g., COS-7 cells, CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182), the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 80091 protein and an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[6013] To construct the plasmid, the 80091 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 80091 coding sequence starting from the initiation codon; the 3′ primer contains sequences complementary to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag, and the last 20 nucleotides of the 80091 coding sequence. The PCR amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably the two restriction sites chosen are different so that the 80091 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.
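The primer layout described above can be sketched in Python as follows. The first and last 20 nucleotides of the coding sequence are copied from SEQ ID NO:94 as recited in Example 53. The restriction sites (HindIII and XhoI), the HA-tag coding sequence, and the function names are assumptions chosen for illustration only; the specification does not name the sites used, and a practical design would also add a few 5′ flanking bases and confirm that the chosen sites are absent from the insert.

```python
# Illustrative layout of the two PCR primers for C-terminal HA tagging.
# From the specification: the first/last 20 nt of the 80091 coding sequence.
# Assumed for illustration: the restriction sites and the HA-tag codons.

CDS_FIRST_20 = "ATGGTGGCTGATGCCTGTAA"   # starts at the initiation codon
CDS_LAST_20 = "AAAAGTACTGTGTGTTACAG"    # last 20 nt before the TAA stop
HA_TAG = "TACCCATACGATGTTCCAGATTACGCT"  # encodes YPYDVPDYA (assumed codons)
STOP = "TAA"
SITE_5 = "AAGCTT"  # HindIII, hypothetical choice
SITE_3 = "CTCGAG"  # XhoI, hypothetical choice

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def design_primers(cds_start: str, cds_end: str, site5: str, site3: str,
                   tag: str, stop: str = STOP) -> tuple[str, str]:
    """Build the 5' and 3' primers, both written 5'->3'."""
    forward = site5 + cds_start
    # Sense-strand order at the 3' end of the amplicon:
    # coding sequence, in-frame tag, stop codon, restriction site.
    sense_tail = cds_end + tag + stop + site3
    reverse = reverse_complement(sense_tail)
    return forward, reverse


forward, reverse = design_primers(CDS_FIRST_20, CDS_LAST_20,
                                  SITE_5, SITE_3, HA_TAG)
print(forward)
print(reverse)
```

Using two different restriction sites, as the paragraph recommends, fixes the orientation of the insert because each end of the digested fragment can ligate to only one end of the digested vector.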

[6014] COS cells are subsequently transfected with the 80091-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. The expression of the 80091 polypeptide is detected by radiolabelling (³⁵S-methionine or ³⁵S-cysteine available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) using an HA specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with ³⁵S-methionine (or ³⁵S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer: 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

[6015] Alternatively, DNA containing the 80091 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 80091 polypeptide is detected by radiolabelling and immunoprecipitation using an 80091 specific monoclonal antibody.

Examples for 46508

Example 58 Identification and Characterization of Human 46508 cDNA

[6016] The human 46508 sequence (FIG. 42; SEQ ID NO:101), which is approximately 1180 nucleotides long, including untranslated regions, contains a predicted methionine-initiated coding sequence of about 684 nucleotides, including the termination codon (nucleotides indicated as “coding” of SEQ ID NO:101; SEQ ID NO:103). The coding sequence encodes a 711 amino acid protein (SEQ ID NO: 102).

Example 59 Tissue Distribution of 46508 mRNA by TaqMan Analysis

[6017] Endogenous human 46508 gene expression was determined using the Perkin-Elmer/ABI 7700 Sequence Detection System which employs TaqMan technology. Briefly, TaqMan technology relies on standard RT-PCR with the addition of a third gene-specific oligonucleotide (referred to as a probe) which has a fluorescent dye coupled to its 5′ end (typically 6-FAM) and a quenching dye at the 3′ end (typically TAMRA). When the fluorescently tagged oligonucleotide is intact, the fluorescent signal from the 5′ dye is quenched. As PCR proceeds, the 5′ to 3′ nucleolytic activity of Taq polymerase digests the labeled probe, releasing a nucleotide labeled with 6-FAM, which is then detected as a fluorescent signal. The PCR cycle at which fluorescence is first released and detected is inversely related to the starting amount of the gene of interest in the test sample (a larger starting amount is detected at an earlier cycle), thus providing a quantitative measure of the initial template concentration. Samples can be internally controlled by the addition of a second set of primers/probe specific for a housekeeping gene such as GAPDH which has been labeled with a different fluorophore on the 5′ end (typically VIC).

[6018] To determine the level of 46508 in various human tissues a primer/probe set was designed. Total RNA was prepared from a series of human tissues using an RNeasy kit from Qiagen. First strand cDNA was prepared from 1 μg total RNA using an oligo-dT primer and Superscript II reverse transcriptase (Gibco/BRL). cDNA obtained from approximately 50 ng total RNA was used per TaqMan reaction. Tissues tested include the human tissues and several cell lines shown in Tables 29, 30 and 31.

[6019] Table 29 below shows expression of 46508 mRNA in various normal and diseased tissues, detected using TaqMan analysis. The highest transcriptional expression of 46508 was noted in the HUVEC cell line, with moderate to high expression found in normal pancreas, brain cortex, hypothalamus, skeletal muscle, DRG (dorsal root ganglion) and skin. Moderate expression was also noted in normal and tumor pairs of breast, ovarian, prostate, colon and lung tissue, along with fibrotic liver, normal heart and diseased heart (CHF, congestive heart failure) samples.

TABLE 29 Tissue Distribution of 46508 mRNA by TaqMan Analysis
Tissue Type    Expression
Artery normal    8.6685
Aorta diseased    5.2992
Vein normal    1.7725
Coronary SMC    17.1577
HUVEC    49.0365
Hemangioma    5.0834
Heart normal    8.2009
Heart CHF    7.1393
Kidney    9.4204
Skeletal Muscle    16.6308
Adipose normal    3.9608
Pancreas    20.3335
Primary osteoblasts    2.7431
Osteoclasts (diff)    0.8955
Skin normal    13.0031
Spinal cord normal    5.1365
Brain Cortex normal    24.2647
Brain Hypothalamus normal    25.2951
Nerve    7.8942
DRG (Dorsal Root Ganglion)    16.5159
Breast normal    6.5016
Breast tumor    11.4382
Ovary normal    10.0965
Ovary Tumor    6.7542
Prostate Normal    10.273
Prostate Tumor    11.0485
Salivary glands    2.1822
Colon normal    3.5327
Colon Tumor    14.9885
Lung normal    4.9273
Lung tumor    7.3655
Lung COPD    5.0658
Colon IBD    3.14
Liver normal    9.3229
Liver fibrosis    8.6385
Spleen normal    2.4129
Tonsil normal    2.83
Lymph node normal    5.3176
Small intestine normal    2.0573
Macrophages    1.3526
Synovium    2.3388
BM-MNC    0.2563
Activated PBMC    1.5538
Neutrophils    1.5755
Megakaryocytes    1.57
Erythroid    15.3566
Positive control    21.1969

[6020] Table 30 below also shows expression of 46508 mRNA in various normal and diseased tissues, detected using TaqMan analysis. Table 30 shows the high transcriptional expression of 46508 in tumor samples compared to normal organ-matched controls. High transcriptional expression was noted in 5/5 primary ovarian tumors, 4/4 primary colon tumors, 2/2 colon to liver metastases, 3/6 primary lung tumors and proliferating HMVEC cells when compared to arrested HMVEC cells.

TABLE 30
Expression of 46508 mRNA in Normal and Cancerous Tissues

Tissue Type                             Expression
PIT 400 Breast Normal                   28.36
PIT 372 Breast Normal                   36.40
CHT 1228 Breast Normal                  8.88
MDA 304 Breast Tumor: MD-IDC            8.34
CHT 2002 Breast Tumor: IDC              3.55
MDA 236-Breast Tumor: PD-IDC            3.21
CHT 562 Breast Tumor: IDC               13.51
NDR 138 Breast Tumor ILC (LG)           25.30
CHT 1841 Lymph node (Breast met)        7.19
PIT 58 Lung (Breast met)                12.22
CHT 620 Ovary Normal                    14.83
CHT 619 Ovary Normal                    7.09
CLN 012 Ovary Tumor                     40.53
CLN 07 Ovary Tumor                      28.26
CLN 17 Ovary Tumor                      94.40
MDA 25 Ovary Tumor                      80.21
CLN 08 Ovary Tumor                      28.07
PIT 298 Lung Normal                     2.14
MDA 185 Lung Normal                     6.05
CLN 930 Lung Normal                     12.13
MPI 215 Lung Tumor --SmC                9.69
MDA 259 Lung Tumor -PDNSCCL             13.94
CHT 832 Lung Tumor -PDNSCCL             8.64
MDA 262 Lung Tumor -SCC                 68.87
CHT 793 Lung Tumor -ACA                 20.83
CHT 331 Lung Tumor -ACA                 8.14
CHT 405 Colon Normal                    2.50
CHT 1685 Colon Normal                   2.10
CHT 371 Colon Normal                    0.95
CHT 382 Colon Tumor: MD                 69.11
CHT 528 Colon Tumor: MD                 54.79
CLN 609 Colon Tumor                     15.25
NDR 210 Colon Tumor: MD-PD              121.16
CHT 340 Colon-Liver Met                 15.25
CHT 1637 Colon-Liver Met                10.82
PIT 260 Liver N (female)                1.46
CHT 1653 Cervix Squamous CC             19.51
CHT 569 Cervix Squamous CC              1.31
A24 HMVEC-Arrested                      11.56
C48 HMVEC-Proliferating                 31.80
Pooled Hemangiomas                      1.16
HCT116N22 Normoxic                      76.42
HCT116H22 Hypoxic                       68.39
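The tumor counts quoted for Table 30 can be reproduced with a short calculation. The following Python sketch is illustrative only: the specification does not state how a tumor sample was scored as showing high expression, so the rule used here (a tumor value exceeding the highest organ-matched normal value) is an assumption, as are the function names. The values are copied from Table 30.

```python
from statistics import mean

# Expression values copied from Table 30 (relative units as reported there).
TABLE_30 = {
    "ovary": {"normal": [14.83, 7.09],
              "tumor": [40.53, 28.26, 94.40, 80.21, 28.07]},
    "colon": {"normal": [2.50, 2.10, 0.95],
              "tumor": [69.11, 54.79, 15.25, 121.16]},
    "lung": {"normal": [2.14, 6.05, 12.13],
             "tumor": [9.69, 13.94, 8.64, 68.87, 20.83, 8.14]},
}


def count_elevated(normal, tumor):
    """Count tumor samples above the highest matched normal value.

    This threshold is an assumed, illustrative criterion; it is not
    stated in the specification.
    """
    ceiling = max(normal)
    return sum(1 for value in tumor if value > ceiling)


def mean_fold_change(normal, tumor):
    """Ratio of mean tumor expression to mean normal expression."""
    return mean(tumor) / mean(normal)


for organ, groups in TABLE_30.items():
    elevated = count_elevated(groups["normal"], groups["tumor"])
    fold = mean_fold_change(groups["normal"], groups["tumor"])
    print(f"{organ}: {elevated}/{len(groups['tumor'])} tumors above normal range, "
          f"mean tumor/normal ratio {fold:.1f}")
```

Under this assumed rule the counts for ovary, colon and lung agree with the 5/5, 4/4 and 3/6 figures given in the text.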

[6021] Table 31 indicates that 46508 mRNA is highly expressed in several ovarian cell lines including SKOV3/Var, A2780, MDA2774 and ES-2. The table allows comparisons between two normal ovarian surface epithelium samples (MDA 127 Normal Ovary and MDA 224 Normal Ovary) and two ovarian ascites (MDA 124 Ovarian Ascites and MDA 126 Ovarian Ascites) samples. Expression of 46508 mRNA is upregulated in one of the ascites samples. The table also shows an experiment where the ovarian cancer cell line, HEY, was serum starved for 24 hours. Time points were taken at 0, 1, 3, 6, 9 and 12 hours after the addition of 10% serum (HEY 0 hr, HEY 1 hr, HEY 3 hr, HEY 6 hr, HEY 9 hr, and HEY 12 hr, respectively). Since cMyc protein is highly upregulated at 1 hour after addition of serum and phosphorylated at 6 hours, the experiment is a good model for identifying targets that are downstream of cMyc. These data indicate that 46508 mRNA may be regulated in a manner similar to cMyc since the expression increases from 1 to 9 hours after the addition of serum.

[6022] Also shown are data involving the ovarian cancer cell lines SKOV3 and SKOV3/Variant. These cell lines were grown in three different cellular environments: on plastic, in soft agar, and as a subcutaneous tumor in nude mice (all cells grown in 10% serum). The plastic sample was used as the “control” in each experiment. The SKOV3/Var cell line is a variant of the parental cell line SKOV3 which is resistant to cisplatin. These data indicate that 46508 mRNA is upregulated in environments that may be more similar to the tumor in vivo (the soft agar and subcutaneous tumor) compared to growth on plastic.

TABLE 31
Expression in Various Ovarian Cells

Tissue Type                     Expression
SKOV-3 No GF                    40.67
SKOV-3 EGF ′15                  41.52
SKOV-3 EGF ′30                  46.23
SKOV-3 EGF ′60                  38.88
SKOV-3 Hrg ′15                  36.02
SKOV-3 Hrg ′30                  41.67
SKOV-3 Hrg ′60                  48.87
SKOV-3 Serum ′30                54.98
SKOV-3var No GF                 151.25
SKOV-3var EGF ′15               140.63
SKOV-3var EGF ′30               126.74
SKOV-3var EGF ′60               125.43
SKOV-3var Hrg ′15               141.61
SKOV-3var Hrg ′30               170.76
SKOV-3var Hrg ′60               140.15
SKOV-3var Serum ′30             190.78
HEY Plastic                     50.94
HEY Soft Agar                   22.96
SKOV-3                          35.65
SKOV-3var                       122.00
A2780                           159.87
A2780-ADR                       60.79
OVCAR-3                         54.98
OVCAR-4                         59.75
MDA2774                         123.28
DOV13                           36.52
Caov-3                          18.14
ES-2                            101.18
HEY 0 hr                        55.17
HEY 1 hr                        62.50
HEY 3 hr                        74.84
HEY 6 hr                        69.59
HEY 9 hr                        77.75
HEY 12 hr                       68.63
SKOV-3 SubQ Tumor               18.01
SKOV-3 Variant Plastic          152.30
SKOV-3 Var SubQ Tumor           9.39
MDA 127 Normal Ovary            11.44
MDA 224 Normal Ovary            17.10
MDA 124 Ovarian Ascites         18.84
MDA 126 Ovarian Ascites         41.67
HEY                             63.81
SKOV-3 Plastic                  72.80
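The HEY serum-stimulation series in Table 31 can be expressed relative to the 0 hour time point. The Python sketch below is a minimal illustration using the six HEY values from Table 31; normalizing to the 0 hour sample is an assumption made here for convenience, and the variable names are hypothetical.

```python
# HEY expression values from Table 31, keyed by hours after addition of 10% serum.
HEY_TIME_COURSE = {0: 55.17, 1: 62.50, 3: 74.84, 6: 69.59, 9: 77.75, 12: 68.63}

# Express each time point relative to the 0 hr sample (assumed baseline).
baseline = HEY_TIME_COURSE[0]
relative = {hour: value / baseline for hour, value in HEY_TIME_COURSE.items()}

# Time point with the highest reported expression.
peak_hour = max(HEY_TIME_COURSE, key=HEY_TIME_COURSE.get)

for hour in sorted(relative):
    print(f"{hour:>2} hr: {relative[hour]:.2f} x baseline")
print(f"peak at {peak_hour} hr")
```

The same normalization could be applied to the growth-environment samples by dividing each value by its plastic control.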

Example 60 Tissue Distribution of 46508 mRNA by Northern Analysis

[6023] Northern blot hybridizations with various RNA samples can be performed under standard conditions and washed under stringent conditions, i.e., 0.2×SSC at 65° C. A DNA probe corresponding to all or a portion of the 46508 cDNA (SEQ ID NO:101) can be used. The DNA can be radioactively labeled with ³²P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) can be probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to the manufacturer's recommendations.
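As a worked illustration of the 0.2×SSC wash, the Python sketch below computes the salt concentrations and the dilution from a concentrated stock. It assumes the conventional 20×SSC stock composition (3.0 M sodium chloride, 0.3 M sodium citrate), which is not stated in this specification; the function names and the 1000 ml batch size are hypothetical.

```python
# Assumed composition of a conventional 20x SSC stock (not given in the text).
STOCK_FOLD = 20.0
STOCK_NACL_M = 3.0
STOCK_CITRATE_M = 0.3


def ssc_molarity(fold):
    """Return (NaCl, sodium citrate) molarity for a given SSC strength."""
    scale = fold / STOCK_FOLD
    return STOCK_NACL_M * scale, STOCK_CITRATE_M * scale


def dilution_volumes(fold, final_ml):
    """Return (ml of 20x stock, ml of water) needed for final_ml of wash."""
    stock_ml = final_ml * fold / STOCK_FOLD
    return stock_ml, final_ml - stock_ml


nacl_m, citrate_m = ssc_molarity(0.2)
stock_ml, water_ml = dilution_volumes(0.2, 1000.0)
print(f"0.2x SSC: {nacl_m * 1000:.0f} mM NaCl, {citrate_m * 1000:.0f} mM sodium citrate")
print(f"For 1000 ml: {stock_ml:.0f} ml of 20x stock plus {water_ml:.0f} ml water")
```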

Example 61 Recombinant Expression of 46508 in Bacterial Cells

[6024] In this example, 46508 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 46508 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-46508 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB199 strain by affinity chromatography on glutathione beads. Using polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates, the molecular weight of the resultant fusion polypeptide is determined.

Example 62 Expression of Recombinant 46508 Protein in COS Cells

[6025] To express the 46508 gene in COS cells (e.g., COS-7 cells, CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182), the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 46508 protein and an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[6026] To construct the plasmid, the 46508 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 46508 coding sequence starting from the initiation codon; the 3′ primer contains sequences complementary to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag, and the last 20 nucleotides of the 46508 coding sequence. The PCR amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably the two restriction sites chosen are different so that the 46508 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.
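The primer layout described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the primers actually used: the coding sequence is a made-up placeholder rather than the 46508 sequence, the HindIII and XhoI sites are arbitrary choices for the two "restriction sites of interest", the HA tag is represented by one commonly used coding sequence for the epitope YPYDVPDYA, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(cds, site_5p, site_3p, tag, anneal_len=20):
    """Build 5' and 3' primers for C-terminal tagging of a coding sequence.

    cds must start at the initiation codon and end with its stop codon.
    """
    stop_codon = cds[-3:]
    body = cds[:-3]  # coding sequence without the stop codon
    # 5' primer: restriction site, then the first anneal_len nt from the ATG.
    forward = site_5p + cds[:anneal_len]
    # Sense-strand layout of the 3' end: last anneal_len nt, tag, stop, site.
    sense_3p = body[-anneal_len:] + tag + stop_codon + site_3p
    # The 3' primer is the reverse complement of that layout.
    return forward, reverse_complement(sense_3p)


# Placeholder values (assumptions, not taken from the specification).
TOY_CDS = "ATGGCTGAAGGTCTGCCATCCAAAGACCTGGTTCACGGTATCTGA"
HINDIII = "AAGCTT"
XHOI = "CTCGAG"
HA_TAG = "TACCCATACGATGTTCCAGATTACGCT"  # encodes YPYDVPDYA

forward_primer, reverse_primer = design_primers(TOY_CDS, HINDIII, XHOI, HA_TAG)
print("5' primer:", forward_primer)
print("3' primer:", reverse_primer)
```

Read 5′ to 3′, the resulting 3′ primer presents the complement of the restriction site, the stop codon, the tag and the last 20 coding nucleotides, in that order, matching the description in the text. In practice a few extra bases are often added ahead of each restriction site to aid digestion; that refinement is omitted here.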

[6027] COS cells are subsequently transfected with the 46508-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. The expression of the 46508 polypeptide is detected by radiolabelling (³⁵S-methionine or ³⁵S-cysteine available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) using an HA specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with ³⁵S-methionine (or ³⁵S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer, 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

[6028] Alternatively, DNA containing the 46508 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 46508 polypeptide is detected by radiolabelling and immunoprecipitation using a 46508 specific monoclonal antibody.

[6029] Equivalents

[6030] Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. Such equivalents are intended to be encompassed by the following claims.

1 104 1 1888 DNA Homo sapiens CDS (91)...(1344) 1 gctgaagcgg ggtaattcct ctcctgcaat tacttttgga tggaagtatg cccctttctc 60 agtagaagat ggtaatcttg gagaatgacc atg gag aag ggg atg agt tct gga 114 Met Glu Lys Gly Met Ser Ser Gly 1 5 gaa ggg ctg cct tcc aga tca tct cag gtt tcg gct ggt aaa ata aca 162 Glu Gly Leu Pro Ser Arg Ser Ser Gln Val Ser Ala Gly Lys Ile Thr 10 15 20 gcc aaa gag ttg gaa aca aag cag tcc tat aaa gag aaa cga gga ggc 210 Ala Lys Glu Leu Glu Thr Lys Gln Ser Tyr Lys Glu Lys Arg Gly Gly 25 30 35 40 ttt gtg ttg gtg cat gca ggt gca ggt tat cat tct gaa tcc aaa gcc 258 Phe Val Leu Val His Ala Gly Ala Gly Tyr His Ser Glu Ser Lys Ala 45 50 55 aag gag tat aaa cat gta tgc aaa cga gct tgt cag aag gca att gaa 306 Lys Glu Tyr Lys His Val Cys Lys Arg Ala Cys Gln Lys Ala Ile Glu 60 65 70 aag ctg cag gcc ggt gct ctt gca act gac gca gtc act gca gca ctg 354 Lys Leu Gln Ala Gly Ala Leu Ala Thr Asp Ala Val Thr Ala Ala Leu 75 80 85 gtg gaa ctt gag gat tct cct ttt aca aat gca gga atg gga tct aat 402 Val Glu Leu Glu Asp Ser Pro Phe Thr Asn Ala Gly Met Gly Ser Asn 90 95 100 cta aat ctg tta ggt gaa att gag tgt gat gcc agc ata atg gat gga 450 Leu Asn Leu Leu Gly Glu Ile Glu Cys Asp Ala Ser Ile Met Asp Gly 105 110 115 120 aaa tcc tta aat ttt gga gca gtt gga gca ctg agt gga atc aag aac 498 Lys Ser Leu Asn Phe Gly Ala Val Gly Ala Leu Ser Gly Ile Lys Asn 125 130 135 cca gtc tcg gtt gcc aac aga ctc tta tgt gaa ggg cag aag ggc aag 546 Pro Val Ser Val Ala Asn Arg Leu Leu Cys Glu Gly Gln Lys Gly Lys 140 145 150 ctc tcg gct ggc aga att cct ccc tgc ttt tta gtt gga gaa gga gcc 594 Leu Ser Ala Gly Arg Ile Pro Pro Cys Phe Leu Val Gly Glu Gly Ala 155 160 165 tac aga tgg gca gta gat cat gga ata ccc tct tgc cct cct aac atc 642 Tyr Arg Trp Ala Val Asp His Gly Ile Pro Ser Cys Pro Pro Asn Ile 170 175 180 atg acc aca aga ttc agt tta gct gca ttt aaa aga aac aag agg aaa 690 Met Thr Thr Arg Phe Ser Leu Ala Ala Phe Lys Arg Asn Lys Arg Lys 185 190 195 200 cta gag ctg gca gaa agg gtg gac aca gat ttt atg caa 
cta aag aaa 738 Leu Glu Leu Ala Glu Arg Val Asp Thr Asp Phe Met Gln Leu Lys Lys 205 210 215 aga aga caa tca agt gag aag gaa aat gac tca ggc act ttg gac acg 786 Arg Arg Gln Ser Ser Glu Lys Glu Asn Asp Ser Gly Thr Leu Asp Thr 220 225 230 gta ggc gct gtg gtt gtg gac cac gaa ggg aat gtt gct gct gct gtc 834 Val Gly Ala Val Val Val Asp His Glu Gly Asn Val Ala Ala Ala Val 235 240 245 tcc agt gga ggc ttg gcc ttg aaa cat ccg ggg aga gtt ggg cag gct 882 Ser Ser Gly Gly Leu Ala Leu Lys His Pro Gly Arg Val Gly Gln Ala 250 255 260 gct ctt tat gga tgt ggc tgc tgg gct gaa aat act gga gct cat aac 930 Ala Leu Tyr Gly Cys Gly Cys Trp Ala Glu Asn Thr Gly Ala His Asn 265 270 275 280 ccc tac tcc aca gct gtg agt acc tca gga tgt gga gag cat ctt gtg 978 Pro Tyr Ser Thr Ala Val Ser Thr Ser Gly Cys Gly Glu His Leu Val 285 290 295 cgc acc ata ctg gct aga gaa tgt tca cat gct tta caa gct gag gat 1026 Arg Thr Ile Leu Ala Arg Glu Cys Ser His Ala Leu Gln Ala Glu Asp 300 305 310 gct cac caa gcc ctg ttg gag act atg caa aac aag ttt atc agt tca 1074 Ala His Gln Ala Leu Leu Glu Thr Met Gln Asn Lys Phe Ile Ser Ser 315 320 325 cct ttc ctt gcc agt gaa gat ggc gtg ctt ggc gga gtg att gtc ctc 1122 Pro Phe Leu Ala Ser Glu Asp Gly Val Leu Gly Gly Val Ile Val Leu 330 335 340 cgt tca tgc aga tgt tct gcc gag cct gac tcc tcc caa aat aag cag 1170 Arg Ser Cys Arg Cys Ser Ala Glu Pro Asp Ser Ser Gln Asn Lys Gln 345 350 355 360 aca ctt cta gtg gaa ttt ctg tgg agc cac acg acg gag agc atg tgt 1218 Thr Leu Leu Val Glu Phe Leu Trp Ser His Thr Thr Glu Ser Met Cys 365 370 375 gtc gga tat atg tca gcc cag gat ggg aaa gcc aag act cac att tca 1266 Val Gly Tyr Met Ser Ala Gln Asp Gly Lys Ala Lys Thr His Ile Ser 380 385 390 aga ctt cct cct ggt gcg gtg gca gga cag tct gtg gca atc gaa ggt 1314 Arg Leu Pro Pro Gly Ala Val Ala Gly Gln Ser Val Ala Ile Glu Gly 395 400 405 ggg gtg tgc cgc ctg gag agc cca gtg aac tgacccttca ggctgagtgt 1364 Gly Val Cys Arg Leu Glu Ser Pro Val Asn 410 415 gaagcgtctc agaggcattt cagaacctga gcttttgggg 
gtttttaact gaagttggtt 1424 gttttatctt tcttgtttta taattcctat tgcaacctcg tgcactgctc gagacacaag 1484 tgctgctgta gttagcgctt agtgacacgc gggcctttgg tgggtgagcg ggactgtgtg 1544 tgagtgtgtg cgcgtatgtg cgcacatatg tgtatgtgtg gagtatgtgt gtttgcttct 1604 ccgtggatga aatagaaact cctcattgtg tgaccaggaa tggttaaatc atctttacaa 1664 aatgtgtgct ttaactgttt acaagtaaaa cctaaagttg caggaaacat tttttatttc 1724 gtaaagaggt accaactgtc gctgatgtga tatgtcagaa ctgaagagta aatctacttg 1784 tttaaatgac ttgacagtgg tagtgctcca tttaataaca gtaataagta ataaagtgtt 1844 tttatttgtt aaccaaaaaa aaaaaaaaaa aaagggcggc cgct 1888 2 418 PRT Homo sapiens 2 Met Glu Lys Gly Met Ser Ser Gly Glu Gly Leu Pro Ser Arg Ser Ser 1 5 10 15 Gln Val Ser Ala Gly Lys Ile Thr Ala Lys Glu Leu Glu Thr Lys Gln 20 25 30 Ser Tyr Lys Glu Lys Arg Gly Gly Phe Val Leu Val His Ala Gly Ala 35 40 45 Gly Tyr His Ser Glu Ser Lys Ala Lys Glu Tyr Lys His Val Cys Lys 50 55 60 Arg Ala Cys Gln Lys Ala Ile Glu Lys Leu Gln Ala Gly Ala Leu Ala 65 70 75 80 Thr Asp Ala Val Thr Ala Ala Leu Val Glu Leu Glu Asp Ser Pro Phe 85 90 95 Thr Asn Ala Gly Met Gly Ser Asn Leu Asn Leu Leu Gly Glu Ile Glu 100 105 110 Cys Asp Ala Ser Ile Met Asp Gly Lys Ser Leu Asn Phe Gly Ala Val 115 120 125 Gly Ala Leu Ser Gly Ile Lys Asn Pro Val Ser Val Ala Asn Arg Leu 130 135 140 Leu Cys Glu Gly Gln Lys Gly Lys Leu Ser Ala Gly Arg Ile Pro Pro 145 150 155 160 Cys Phe Leu Val Gly Glu Gly Ala Tyr Arg Trp Ala Val Asp His Gly 165 170 175 Ile Pro Ser Cys Pro Pro Asn Ile Met Thr Thr Arg Phe Ser Leu Ala 180 185 190 Ala Phe Lys Arg Asn Lys Arg Lys Leu Glu Leu Ala Glu Arg Val Asp 195 200 205 Thr Asp Phe Met Gln Leu Lys Lys Arg Arg Gln Ser Ser Glu Lys Glu 210 215 220 Asn Asp Ser Gly Thr Leu Asp Thr Val Gly Ala Val Val Val Asp His 225 230 235 240 Glu Gly Asn Val Ala Ala Ala Val Ser Ser Gly Gly Leu Ala Leu Lys 245 250 255 His Pro Gly Arg Val Gly Gln Ala Ala Leu Tyr Gly Cys Gly Cys Trp 260 265 270 Ala Glu Asn Thr Gly Ala His Asn Pro Tyr Ser Thr Ala Val Ser Thr 275 280 285 Ser Gly Cys Gly Glu His Leu Val 
Arg Thr Ile Leu Ala Arg Glu Cys 290 295 300 Ser His Ala Leu Gln Ala Glu Asp Ala His Gln Ala Leu Leu Glu Thr 305 310 315 320 Met Gln Asn Lys Phe Ile Ser Ser Pro Phe Leu Ala Ser Glu Asp Gly 325 330 335 Val Leu Gly Gly Val Ile Val Leu Arg Ser Cys Arg Cys Ser Ala Glu 340 345 350 Pro Asp Ser Ser Gln Asn Lys Gln Thr Leu Leu Val Glu Phe Leu Trp 355 360 365 Ser His Thr Thr Glu Ser Met Cys Val Gly Tyr Met Ser Ala Gln Asp 370 375 380 Gly Lys Ala Lys Thr His Ile Ser Arg Leu Pro Pro Gly Ala Val Ala 385 390 395 400 Gly Gln Ser Val Ala Ile Glu Gly Gly Val Cys Arg Leu Glu Ser Pro 405 410 415 Val Asn 3 1257 DNA Homo sapiens 3 atggagaagg ggatgagttc tggagaaggg ctgccttcca gatcatctca ggtttcggct 60 ggtaaaataa cagccaaaga gttggaaaca aagcagtcct ataaagagaa acgaggaggc 120 tttgtgttgg tgcatgcagg tgcaggttat cattctgaat ccaaagccaa ggagtataaa 180 catgtatgca aacgagcttg tcagaaggca attgaaaagc tgcaggccgg tgctcttgca 240 actgacgcag tcactgcagc actggtggaa cttgaggatt ctccttttac aaatgcagga 300 atgggatcta atctaaatct gttaggtgaa attgagtgtg atgccagcat aatggatgga 360 aaatccttaa attttggagc agttggagca ctgagtggaa tcaagaaccc agtctcggtt 420 gccaacagac tcttatgtga agggcagaag ggcaagctct cggctggcag aattcctccc 480 tgctttttag ttggagaagg agcctacaga tgggcagtag atcatggaat accctcttgc 540 cctcctaaca tcatgaccac aagattcagt ttagctgcat ttaaaagaaa caagaggaaa 600 ctagagctgg cagaaagggt ggacacagat tttatgcaac taaagaaaag aagacaatca 660 agtgagaagg aaaatgactc aggcactttg gacacggtag gcgctgtggt tgtggaccac 720 gaagggaatg ttgctgctgc tgtctccagt ggaggcttgg ccttgaaaca tccggggaga 780 gttgggcagg ctgctcttta tggatgtggc tgctgggctg aaaatactgg agctcataac 840 ccctactcca cagctgtgag tacctcagga tgtggagagc atcttgtgcg caccatactg 900 gctagagaat gttcacatgc tttacaagct gaggatgctc accaagccct gttggagact 960 atgcaaaaca agtttatcag ttcacctttc cttgccagtg aagatggcgt gcttggcgga 1020 gtgattgtcc tccgttcatg cagatgttct gccgagcctg actcctccca aaataagcag 1080 acacttctag tggaatttct gtggagccac acgacggaga gcatgtgtgt cggatatatg 1140 tcagcccagg atgggaaagc caagactcac atttcaagac 
ttcctcctgg tgcggtggca 1200 ggacagtctg tggcaatcga aggtggggtg tgccgcctgg agagcccagt gaactga 1257 4 1358 DNA Homo sapiens CDS (134)...(1057) 4 tccgagagcg gtggcgggct gagcggttac gagccggcgt cggggagcgg cggtaccggg 60 cggctgcggg gctggctcga cccagcttga ggtctcggcg tccgcgtcct gcggtgccct 120 gggatccgcc gac atg aat ccc atc gta gtg gtc cac ggc ggc gga gcc 169 Met Asn Pro Ile Val Val Val His Gly Gly Gly Ala 1 5 10 ggt ccc atc tcc aag gat cgg aag gag cga gtg cac cag ggc atg gtc 217 Gly Pro Ile Ser Lys Asp Arg Lys Glu Arg Val His Gln Gly Met Val 15 20 25 aga gcc gcc acc gtg ggc tac ggc atc ctc cgg gag ggc ggg agc gcc 265 Arg Ala Ala Thr Val Gly Tyr Gly Ile Leu Arg Glu Gly Gly Ser Ala 30 35 40 gtg gat gcc gta gag gga gct gtc gtc gcc ctg gaa gac gat ccc gag 313 Val Asp Ala Val Glu Gly Ala Val Val Ala Leu Glu Asp Asp Pro Glu 45 50 55 60 ttc aac gca ggt tgt ggg tct gtc ttg aac aca aat ggt gag gtt gaa 361 Phe Asn Ala Gly Cys Gly Ser Val Leu Asn Thr Asn Gly Glu Val Glu 65 70 75 atg gat gct agt atc atg gat gga aaa gac ctg tct gca gga gca gtg 409 Met Asp Ala Ser Ile Met Asp Gly Lys Asp Leu Ser Ala Gly Ala Val 80 85 90 tcc gca gtc cag tgt ata gca aat ccc att aaa ctt gct cgg ctt gtc 457 Ser Ala Val Gln Cys Ile Ala Asn Pro Ile Lys Leu Ala Arg Leu Val 95 100 105 atg gaa aag aca cct cat tgc ttt ctg act gac caa ggc gca gcg cag 505 Met Glu Lys Thr Pro His Cys Phe Leu Thr Asp Gln Gly Ala Ala Gln 110 115 120 ttt gca gca gct atg ggg gtt cca gag att cct gga gaa aaa ctg gtg 553 Phe Ala Ala Ala Met Gly Val Pro Glu Ile Pro Gly Glu Lys Leu Val 125 130 135 140 aca gag aga aac aaa aag cgc ctg gaa aaa gag aag cat gaa aaa ggt 601 Thr Glu Arg Asn Lys Lys Arg Leu Glu Lys Glu Lys His Glu Lys Gly 145 150 155 gct cag aaa aca gat tgt caa aaa aac ttg gga acc gtg ggt gct gtt 649 Ala Gln Lys Thr Asp Cys Gln Lys Asn Leu Gly Thr Val Gly Ala Val 160 165 170 gcc ttg gac tgc aaa ggg aat gta gcc tac gca acc tcc aca ggc ggt 697 Ala Leu Asp Cys Lys Gly Asn Val Ala Tyr Ala Thr Ser Thr Gly Gly 175 180 185 atc gtt aat aaa atg gtc 
ggc cgc gtt ggg gac tca ccg tgt cta gga 745 Ile Val Asn Lys Met Val Gly Arg Val Gly Asp Ser Pro Cys Leu Gly 190 195 200 gct gga ggt tat gcc gac aat gac atc gga gcc gtc tca acc aca ggg 793 Ala Gly Gly Tyr Ala Asp Asn Asp Ile Gly Ala Val Ser Thr Thr Gly 205 210 215 220 cat ggg gaa agc atc ctg aag gtg aac ctg gct aga ctc acc ctg ttc 841 His Gly Glu Ser Ile Leu Lys Val Asn Leu Ala Arg Leu Thr Leu Phe 225 230 235 cac ata gaa caa gga aag acg gta gaa gag gct gcg gac cta tcg ttg 889 His Ile Glu Gln Gly Lys Thr Val Glu Glu Ala Ala Asp Leu Ser Leu 240 245 250 ggt tat atg aag tca agg gtt aaa ggt tta ggt ggc ctc atc gtg gtt 937 Gly Tyr Met Lys Ser Arg Val Lys Gly Leu Gly Gly Leu Ile Val Val 255 260 265 agc aaa aca gga gac tgg gtg gca aag tgg acc tcc acc tcc atg ccc 985 Ser Lys Thr Gly Asp Trp Val Ala Lys Trp Thr Ser Thr Ser Met Pro 270 275 280 tgg gca gcc gcc aag gac ggc aag ctg cac ttc gga att gat cct gac 1033 Trp Ala Ala Ala Lys Asp Gly Lys Leu His Phe Gly Ile Asp Pro Asp 285 290 295 300 gat act act atc acc gac ctt ccc taagccgctg gaagattgta ttccagatgc 1087 Asp Thr Thr Ile Thr Asp Leu Pro 305 tagcttagag gtcaagtaca gtctcctcat gagacatagc ctaatcaatt agatctagaa 1147 ttggaaaaat tgtcccgtct gtcacttgtt ttgttgcctt aataagcatc tgaatgtttg 1207 gttgtggggc gggttctgaa gcgatgagag aaatgcccgt attaggagga ttacttgagc 1267 cctggaggtc aaagctgagg tgagccatga ttactccact gcactccagc ctgggcaaca 1327 gagccaggcc ctgtatcaaa aaaaaaaaaa a 1358 5 308 PRT Homo sapiens 5 Met Asn Pro Ile Val Val Val His Gly Gly Gly Ala Gly Pro Ile Ser 1 5 10 15 Lys Asp Arg Lys Glu Arg Val His Gln Gly Met Val Arg Ala Ala Thr 20 25 30 Val Gly Tyr Gly Ile Leu Arg Glu Gly Gly Ser Ala Val Asp Ala Val 35 40 45 Glu Gly Ala Val Val Ala Leu Glu Asp Asp Pro Glu Phe Asn Ala Gly 50 55 60 Cys Gly Ser Val Leu Asn Thr Asn Gly Glu Val Glu Met Asp Ala Ser 65 70 75 80 Ile Met Asp Gly Lys Asp Leu Ser Ala Gly Ala Val Ser Ala Val Gln 85 90 95 Cys Ile Ala Asn Pro Ile Lys Leu Ala Arg Leu Val Met Glu Lys Thr 100 105 110 Pro His Cys Phe Leu Thr Asp Gln 
Gly Ala Ala Gln Phe Ala Ala Ala 115 120 125 Met Gly Val Pro Glu Ile Pro Gly Glu Lys Leu Val Thr Glu Arg Asn 130 135 140 Lys Lys Arg Leu Glu Lys Glu Lys His Glu Lys Gly Ala Gln Lys Thr 145 150 155 160 Asp Cys Gln Lys Asn Leu Gly Thr Val Gly Ala Val Ala Leu Asp Cys 165 170 175 Lys Gly Asn Val Ala Tyr Ala Thr Ser Thr Gly Gly Ile Val Asn Lys 180 185 190 Met Val Gly Arg Val Gly Asp Ser Pro Cys Leu Gly Ala Gly Gly Tyr 195 200 205 Ala Asp Asn Asp Ile Gly Ala Val Ser Thr Thr Gly His Gly Glu Ser 210 215 220 Ile Leu Lys Val Asn Leu Ala Arg Leu Thr Leu Phe His Ile Glu Gln 225 230 235 240 Gly Lys Thr Val Glu Glu Ala Ala Asp Leu Ser Leu Gly Tyr Met Lys 245 250 255 Ser Arg Val Lys Gly Leu Gly Gly Leu Ile Val Val Ser Lys Thr Gly 260 265 270 Asp Trp Val Ala Lys Trp Thr Ser Thr Ser Met Pro Trp Ala Ala Ala 275 280 285 Lys Asp Gly Lys Leu His Phe Gly Ile Asp Pro Asp Asp Thr Thr Ile 290 295 300 Thr Asp Leu Pro 305 6 927 DNA Homo sapiens 6 atgaatccca tcgtagtggt ccacggcggc ggagccggtc ccatctccaa ggatcggaag 60 gagcgagtgc accagggcat ggtcagagcc gccaccgtgg gctacggcat cctccgggag 120 ggcgggagcg ccgtggatgc cgtagaggga gctgtcgtcg ccctggaaga cgatcccgag 180 ttcaacgcag gttgtgggtc tgtcttgaac acaaatggtg aggttgaaat ggatgctagt 240 atcatggatg gaaaagacct gtctgcagga gcagtgtccg cagtccagtg tatagcaaat 300 cccattaaac ttgctcggct tgtcatggaa aagacacctc attgctttct gactgaccaa 360 ggcgcagcgc agtttgcagc agctatgggg gttccagaga ttcctggaga aaaactggtg 420 acagagagaa acaaaaagcg cctggaaaaa gagaagcatg aaaaaggtgc tcagaaaaca 480 gattgtcaaa aaaacttggg aaccgtgggt gctgttgcct tggactgcaa agggaatgta 540 gcctacgcaa cctccacagg cggtatcgtt aataaaatgg tcggccgcgt tggggactca 600 ccgtgtctag gagctggagg ttatgccgac aatgacatcg gagccgtctc aaccacaggg 660 catggggaaa gcatcctgaa ggtgaacctg gctagactca ccctgttcca catagaacaa 720 ggaaagacgg tagaagaggc tgcggaccta tcgttgggtt atatgaagtc aagggttaaa 780 ggtttaggtg gcctcatcgt ggttagcaaa acaggagact gggtggcaaa gtggacctcc 840 acctccatgc cctgggcagc cgccaaggac ggcaagctgc acttcggaat tgatcctgac 900 gatactacta 
tcaccgacct tccctaa 927 7 378 PRT Artificial Sequence consensus sequence 7 Gly Thr Leu Leu Ile Ala Ile His Gly Leu Glu Gly Ala Gly Asp Ile 1 5 10 15 Asp Ser Ser Glu Pro Lys Thr Thr Asn Leu Pro Leu Val Leu Thr Thr 20 25 30 Trp Arg Ser Glu Ala Leu Lys His Ala Val Glu Ala Ala Trp Lys Ala 35 40 45 Leu Lys Ala Gly Gly Ser Ala Leu Asp Ala Val Glu Lys Gly Val Arg 50 55 60 Leu Leu Glu Asn Glu Pro Cys Asp Phe Asn Ala Gly Tyr Gly Gly Val 65 70 75 80 Leu Asp Glu Asp Gly Thr Val Glu Leu Asp Ala Ser Ile Met Asp Gly 85 90 95 Asn Thr Ser Ser Ser Met Val Val Ile Glu Asn Ile Phe Cys Arg Asp 100 105 110 Gly Met Lys Val Gly Ala Val Ala Gly Leu Ser Arg Ile Lys Asn Pro 115 120 125 Ile Ser Val Ala Arg Leu Val Met Glu Lys Thr Pro His Ile Leu Leu 130 135 140 Val Gly Glu Gly Ala Glu Glu Phe Ala Lys Ser Gln Gly Phe Glu Thr 145 150 155 160 Glu Asp Leu Ser Thr Phe Glu Thr Gln Glu Trp Ile Glu Glu Trp Leu 165 170 175 Ala Ala Lys Glu Gln Lys Asn Tyr Trp Lys Arg Val Ile Leu Asp Pro 180 185 190 Ser Val Tyr Cys Gly Pro Tyr Lys Thr Pro Gly Leu Leu Lys Ser Glu 195 200 205 Arg Asp Ile Pro Leu Asp Asn Glu Asp Ser Glu Ala Gly Tyr Leu Val 210 215 220 Asp Asp Arg Gln His Gly Thr Ile Gly Met Val Ala Leu Asp Ala Glu 225 230 235 240 Gly Asn Leu Ala Ala Ala Thr Ser Thr Gly Gly Met Val Asn Lys Met 245 250 255 His Gly Arg Val Gly Asp Ser Pro Ile Ile Gly Ala Gly Ala Tyr Ala 260 265 270 Asn Asn Phe Ala Gly Ala Val Ser Ala Thr Gly Lys Gly Glu Val Ile 275 280 285 Ile Arg Ala Leu Pro Ala Tyr Asp Val Val Ala Leu Met Glu Tyr Gly 290 295 300 Gly Lys Pro Leu Ser Leu Ala Glu Ala Ala Ala Lys Arg Ile Thr Lys 305 310 315 320 Ala Leu Pro Lys Arg Gly Lys Asn Leu Lys Asp Gly Ser Gly Gly Leu 325 330 335 Ile Ala Leu Asn His Lys Gly Glu Ile Ala Ala Pro Cys Asn Thr Thr 340 345 350 Gly Met Phe Arg Ala Ala His Thr Ala Thr Glu Asp Gly Thr Thr Leu 355 360 365 Glu Tyr Ser Glu Ile Gly Ile Trp Glu Lys 370 375 8 9 PRT Artificial Sequence exemplary motif 8 Xaa Xaa Xaa Thr Gly Gly Thr Xaa Xaa 1 5 9 11 PRT Artificial Sequence exemplary motif 
9 Gly Xaa Xaa Xaa Xaa His Gly Thr Asp Thr Xaa 1 5 10 10 1937 DNA Homo sapiens CDS (322)...(1701) 10 ggctgctggg ctggcggggc gcaggccgcg ggacccgagc ccggggaagc gagagagcgg 60 aggcgccgag gatccgattc actccctggg gagacctatg ggccgaagcc gtgtaaatgc 120 gttttaagca gaggcctcgg ctccgcaact gccactcctc ctcggggtgt tgcacaagtt 180 tcgaggtcac cggcgacccc ccctagcagc gcgcctggct ctggcccccg cgaaggagga 240 cggagtttgt gtgttgcata ctttctaagg cggcggctgc agcagcggct ccatccagcc 300 cgtcagctcc tcctgcaagg c atg gct ggc tac ctg agt gaa tcg gac ttt 351 Met Ala Gly Tyr Leu Ser Glu Ser Asp Phe 1 5 10 gtg atg gtg gag gag ggc ttc agt acc cga gac ctg ctg aag gaa ctc 399 Val Met Val Glu Glu Gly Phe Ser Thr Arg Asp Leu Leu Lys Glu Leu 15 20 25 act ctg ggg gcc tca cag gcc acc acg gac gag gta gct gcc ttc ttc 447 Thr Leu Gly Ala Ser Gln Ala Thr Thr Asp Glu Val Ala Ala Phe Phe 30 35 40 gtg gct gac ctg ggt gcc ata gtg agg aag cac ttt tgc ttt ctg aag 495 Val Ala Asp Leu Gly Ala Ile Val Arg Lys His Phe Cys Phe Leu Lys 45 50 55 tgc ctg cca cga gtc cgg ccc ttt tat gct gtc aag tgc aac agc agc 543 Cys Leu Pro Arg Val Arg Pro Phe Tyr Ala Val Lys Cys Asn Ser Ser 60 65 70 cca ggt gtg ctg aag gtt ctg gcc cag ctg ggg ctg ggc ttt agc tgt 591 Pro Gly Val Leu Lys Val Leu Ala Gln Leu Gly Leu Gly Phe Ser Cys 75 80 85 90 gcc aac aag gca gag atg gag ttg gtc cag cat att gga atc cct gcc 639 Ala Asn Lys Ala Glu Met Glu Leu Val Gln His Ile Gly Ile Pro Ala 95 100 105 agt aag atc atc tgc gcc aac ccc tgt aag caa att gca cag atc aaa 687 Ser Lys Ile Ile Cys Ala Asn Pro Cys Lys Gln Ile Ala Gln Ile Lys 110 115 120 tat gct gcc aag cat ggg atc cag ctg ctg agc ttt gac aat gag atg 735 Tyr Ala Ala Lys His Gly Ile Gln Leu Leu Ser Phe Asp Asn Glu Met 125 130 135 gag ctg gca aag gtg gta aag agc cac ccc agt gcc aag atg gtt ctg 783 Glu Leu Ala Lys Val Val Lys Ser His Pro Ser Ala Lys Met Val Leu 140 145 150 tgc att gct acc gat gac tcc cac tcc ctg agc tgc ctg agc cta aag 831 Cys Ile Ala Thr Asp Asp Ser His Ser Leu Ser Cys Leu Ser Leu Lys 155 160 165 170 ttt 
gga gtg tca ctg aaa tcc tgc aga cac ctg ctt gaa aat gcg aag 879 Phe Gly Val Ser Leu Lys Ser Cys Arg His Leu Leu Glu Asn Ala Lys 175 180 185 aag cac cat gtg gag gtg gtg ggt gtg agt ttt cac att ggc agt ggc 927 Lys His His Val Glu Val Val Gly Val Ser Phe His Ile Gly Ser Gly 190 195 200 tgt cct gac cct cag gcc tat gct cag tcc atc gca gac gcc cgg ctc 975 Cys Pro Asp Pro Gln Ala Tyr Ala Gln Ser Ile Ala Asp Ala Arg Leu 205 210 215 gtg ttt gaa atg ggc acc gag ctg ggt cac aag atg cac gtt ctg gac 1023 Val Phe Glu Met Gly Thr Glu Leu Gly His Lys Met His Val Leu Asp 220 225 230 ctt ggt ggt ggc ttc cct ggc aca gaa ggg gcc aaa gtg aga ttt gaa 1071 Leu Gly Gly Gly Phe Pro Gly Thr Glu Gly Ala Lys Val Arg Phe Glu 235 240 245 250 gag att gct tcc gtg atc aac tca gcc ttg gac ctg tac ttc cca gag 1119 Glu Ile Ala Ser Val Ile Asn Ser Ala Leu Asp Leu Tyr Phe Pro Glu 255 260 265 ggc tgt ggc gtg gac atc ttt gct gag ctg ggg cgc tac tac gtg acc 1167 Gly Cys Gly Val Asp Ile Phe Ala Glu Leu Gly Arg Tyr Tyr Val Thr 270 275 280 tcg gcc ttc act gtg gca gtc agc atc att gcc aag aag gag gtt ctg 1215 Ser Ala Phe Thr Val Ala Val Ser Ile Ile Ala Lys Lys Glu Val Leu 285 290 295 cta gac cag cct ggc agg gag gag gaa aat ggt tcc acc tca aga ccc 1263 Leu Asp Gln Pro Gly Arg Glu Glu Glu Asn Gly Ser Thr Ser Arg Pro 300 305 310 atc gtg tac cac ctt gat gag ggc gtg tat ggg atc ttc aac tca gtc 1311 Ile Val Tyr His Leu Asp Glu Gly Val Tyr Gly Ile Phe Asn Ser Val 315 320 325 330 ctg ttt gac aac atc tgc cct acc ccc atc ctg cag aag aaa cca tcc 1359 Leu Phe Asp Asn Ile Cys Pro Thr Pro Ile Leu Gln Lys Lys Pro Ser 335 340 345 acg gag cag ccc ctg tac agc agc agc ctg tgg ggc ccg gcg gtt gat 1407 Thr Glu Gln Pro Leu Tyr Ser Ser Ser Leu Trp Gly Pro Ala Val Asp 350 355 360 ggc tgt gat tgc gtg gct gag ggc ctg tgg ctg ccg caa cta cac gta 1455 Gly Cys Asp Cys Val Ala Glu Gly Leu Trp Leu Pro Gln Leu His Val 365 370 375 ggg gac tgg ctg gtc ttt gac aac atg ggc gcc tac act gtg ggc atg 1503 Gly Asp Trp Leu Val Phe Asp Asn Met Gly Ala 
Tyr Thr Val Gly Met 380 385 390 ggt tcc ccc ttt tgg ggg acc cag gcc tgc cac atc acc tat gcc atg 1551 Gly Ser Pro Phe Trp Gly Thr Gln Ala Cys His Ile Thr Tyr Ala Met 395 400 405 410 tcc cgg gtg gcc tgg gaa gcg ctg cga agg cag ctg atg gct gca gaa 1599 Ser Arg Val Ala Trp Glu Ala Leu Arg Arg Gln Leu Met Ala Ala Glu 415 420 425 cag gag gat gac gtg gag ggt gtg tgc aag cct ctg tcc tgc ggc tgg 1647 Gln Glu Asp Asp Val Glu Gly Val Cys Lys Pro Leu Ser Cys Gly Trp 430 435 440 gag atc aca gac acc ctg tgc gtg ggc cct gtc ttc acc cca gcg agc 1695 Glu Ile Thr Asp Thr Leu Cys Val Gly Pro Val Phe Thr Pro Ala Ser 445 450 455 atc atg tgagtgggcc tcgttccccc cggagaatcc cagcggggcc tcagagatgc 1751 Ile Met 460 atctgggaga ggtgggcgag gcagcgagct ggtaccctct ggccaggact tctggtgctc 1811 gctctgccgc cccacgctcc acctgtagtg tttctgccct gtaaatagga ccagtcttac 1871 actcgcttgt agtttcaagt atgcaacata aatcctgtcc cttccaaaaa aaaaaaaaaa 1931 aaaaaa 1937 11 460 PRT Homo sapiens 11 Met Ala Gly Tyr Leu Ser Glu Ser Asp Phe Val Met Val Glu Glu Gly 1 5 10 15 Phe Ser Thr Arg Asp Leu Leu Lys Glu Leu Thr Leu Gly Ala Ser Gln 20 25 30 Ala Thr Thr Asp Glu Val Ala Ala Phe Phe Val Ala Asp Leu Gly Ala 35 40 45 Ile Val Arg Lys His Phe Cys Phe Leu Lys Cys Leu Pro Arg Val Arg 50 55 60 Pro Phe Tyr Ala Val Lys Cys Asn Ser Ser Pro Gly Val Leu Lys Val 65 70 75 80 Leu Ala Gln Leu Gly Leu Gly Phe Ser Cys Ala Asn Lys Ala Glu Met 85 90 95 Glu Leu Val Gln His Ile Gly Ile Pro Ala Ser Lys Ile Ile Cys Ala 100 105 110 Asn Pro Cys Lys Gln Ile Ala Gln Ile Lys Tyr Ala Ala Lys His Gly 115 120 125 Ile Gln Leu Leu Ser Phe Asp Asn Glu Met Glu Leu Ala Lys Val Val 130 135 140 Lys Ser His Pro Ser Ala Lys Met Val Leu Cys Ile Ala Thr Asp Asp 145 150 155 160 Ser His Ser Leu Ser Cys Leu Ser Leu Lys Phe Gly Val Ser Leu Lys 165 170 175 Ser Cys Arg His Leu Leu Glu Asn Ala Lys Lys His His Val Glu Val 180 185 190 Val Gly Val Ser Phe His Ile Gly Ser Gly Cys Pro Asp Pro Gln Ala 195 200 205 Tyr Ala Gln Ser Ile Ala Asp Ala Arg Leu Val Phe Glu Met Gly Thr 210 215 
220 Glu Leu Gly His Lys Met His Val Leu Asp Leu Gly Gly Gly Phe Pro 225 230 235 240 Gly Thr Glu Gly Ala Lys Val Arg Phe Glu Glu Ile Ala Ser Val Ile 245 250 255 Asn Ser Ala Leu Asp Leu Tyr Phe Pro Glu Gly Cys Gly Val Asp Ile 260 265 270 Phe Ala Glu Leu Gly Arg Tyr Tyr Val Thr Ser Ala Phe Thr Val Ala 275 280 285 Val Ser Ile Ile Ala Lys Lys Glu Val Leu Leu Asp Gln Pro Gly Arg 290 295 300 Glu Glu Glu Asn Gly Ser Thr Ser Arg Pro Ile Val Tyr His Leu Asp 305 310 315 320 Glu Gly Val Tyr Gly Ile Phe Asn Ser Val Leu Phe Asp Asn Ile Cys 325 330 335 Pro Thr Pro Ile Leu Gln Lys Lys Pro Ser Thr Glu Gln Pro Leu Tyr 340 345 350 Ser Ser Ser Leu Trp Gly Pro Ala Val Asp Gly Cys Asp Cys Val Ala 355 360 365 Glu Gly Leu Trp Leu Pro Gln Leu His Val Gly Asp Trp Leu Val Phe 370 375 380 Asp Asn Met Gly Ala Tyr Thr Val Gly Met Gly Ser Pro Phe Trp Gly 385 390 395 400 Thr Gln Ala Cys His Ile Thr Tyr Ala Met Ser Arg Val Ala Trp Glu 405 410 415 Ala Leu Arg Arg Gln Leu Met Ala Ala Glu Gln Glu Asp Asp Val Glu 420 425 430 Gly Val Cys Lys Pro Leu Ser Cys Gly Trp Glu Ile Thr Asp Thr Leu 435 440 445 Cys Val Gly Pro Val Phe Thr Pro Ala Ser Ile Met 450 455 460 12 1383 DNA Homo sapiens 12 atggctggct acctgagtga atcggacttt gtgatggtgg aggagggctt cagtacccga 60 gacctgctga aggaactcac tctgggggcc tcacaggcca ccacggacga ggtagctgcc 120 ttcttcgtgg ctgacctggg tgccatagtg aggaagcact tttgctttct gaagtgcctg 180 ccacgagtcc ggccctttta tgctgtcaag tgcaacagca gcccaggtgt gctgaaggtt 240 ctggcccagc tggggctggg ctttagctgt gccaacaagg cagagatgga gttggtccag 300 catattggaa tccctgccag taagatcatc tgcgccaacc cctgtaagca aattgcacag 360 atcaaatatg ctgccaagca tgggatccag ctgctgagct ttgacaatga gatggagctg 420 gcaaaggtgg taaagagcca ccccagtgcc aagatggttc tgtgcattgc taccgatgac 480 tcccactccc tgagctgcct gagcctaaag tttggagtgt cactgaaatc ctgcagacac 540 ctgcttgaaa atgcgaagaa gcaccatgtg gaggtggtgg gtgtgagttt tcacattggc 600 agtggctgtc ctgaccctca ggcctatgct cagtccatcg cagacgcccg gctcgtgttt 660 gaaatgggca ccgagctggg tcacaagatg cacgttctgg accttggtgg 
tggcttccct 720 ggcacagaag gggccaaagt gagatttgaa gagattgctt ccgtgatcaa ctcagccttg 780 gacctgtact tcccagaggg ctgtggcgtg gacatctttg ctgagctggg gcgctactac 840 gtgacctcgg ccttcactgt ggcagtcagc atcattgcca agaaggaggt tctgctagac 900 cagcctggca gggaggagga aaatggttcc acctcaagac ccatcgtgta ccaccttgat 960 gagggcgtgt atgggatctt caactcagtc ctgtttgaca acatctgccc tacccccatc 1020 ctgcagaaga aaccatccac ggagcagccc ctgtacagca gcagcctgtg gggcccggcg 1080 gttgatggct gtgattgcgt ggctgagggc ctgtggctgc cgcaactaca cgtaggggac 1140 tggctggtct ttgacaacat gggcgcctac actgtgggca tgggttcccc cttttggggg 1200 acccaggcct gccacatcac ctatgccatg tcccgggtgg cctgggaagc gctgcgaagg 1260 cagctgatgg ctgcagaaca ggaggatgac gtggagggtg tgtgcaagcc tctgtcctgc 1320 ggctgggaga tcacagacac cctgtgcgtg ggccctgtct tcaccccagc gagcatcatg 1380 tga 1383 13 467 PRT Artificial Sequence consensus sequence 13 Phe Tyr Val Tyr Asp Leu Gly Leu His Ile Val Arg Arg Ile His Ala 1 5 10 15 Leu Trp Lys Ala Phe Leu Pro Arg Gly Gln Tyr Asn Ser Val Val Lys 20 25 30 Pro Phe Tyr Ala Val Lys Ala Asn Ser Asp Pro Ala Val Leu Arg Leu 35 40 45 Leu Ala Glu Leu Gly Thr His Ser Leu Gly Phe Asp Cys Ala Ser Lys 50 55 60 Gly Glu Leu Glu Arg Val Leu Ala Ala Tyr Leu Ala Gly Val Ser Pro 65 70 75 80 Glu Arg Ile Ile Phe Ala Asn Pro Cys Lys Ser Arg Ser Glu Leu Arg 85 90 95 Tyr Ala Leu Glu His Arg Lys Met Gly Gly Val Val Cys Val Thr Val 100 105 110 Asp Asn Val Glu Glu Leu Glu Lys Ile Ala Lys Leu Ala Pro Glu Ala 115 120 125 Gly Val Lys Pro Arg Leu Leu Leu Arg Val Lys Pro Asp Val Asp Ala 130 135 140 His Ala His Cys Arg Leu Ser Thr Gly Gln Glu Asp Ser Lys Phe Gly 145 150 155 160 Ala Asp Leu Glu Asp Gly Glu Asp Ala Glu Ala Leu Leu Lys Ala Ala 165 170 175 Lys Glu Leu Gly Asn Leu Asn Val Val Gly Val His Phe His Val Gly 180 185 190 Ser Gly Ile Ser Asp Leu Glu Ala Phe Val Lys Ala Val Arg Asp Ala 195 200 205 Arg Asn Val Phe Asp Gln Gly Ala Asp Glu Leu Gly Phe Lys Thr Ile 210 215 220 Asp Leu Lys Ile Leu Asp Ile Gly Gly Gly Phe Gly Val Asp Tyr Thr 225 230 235 240 Gly Thr Arg 
Ser Gln Ser Asp Met Ser Val Ala Glu Asp Phe Glu Glu 245 250 255 Ile Ala Glu Val Ile Asn Ala Ala Leu Glu Glu Leu Phe Pro His Ala 260 265 270 Gly Tyr Gly Asp Pro Gly Pro Thr Ile Ile Ala Glu Pro Gly Arg Tyr 275 280 285 Ile Val Ala Ala Ala Gly Thr Leu Val Ser Asn Val Ile Ala Lys Lys 290 295 300 Glu Val Pro Ser Asp Asp Ala Asp Thr Thr Ser Asp Ser Leu Arg Glu 305 310 315 320 Glu Ser Lys Asp Asp Thr Arg Met Tyr Tyr Val Asn Asp Gly Gly Tyr 325 330 335 Gly Ser Phe Ile Arg Pro Leu Leu Tyr His Ala His Pro Glu Ala Leu 340 345 350 Leu Leu Arg Arg Gly Gly Glu Val Gln Tyr Gln Asp Ala Glu Thr Glu 355 360 365 Arg Ala Ala Asp Lys Ser Leu Ser Asn Phe Ser Leu Phe Gln Ser Tyr 370 375 380 Pro Asp Ala Trp Gly Ile Asp Gln Leu Phe Pro Val Leu Pro Leu Arg 385 390 395 400 Ser Leu Asp Glu Glu Pro Lys Arg Lys Ser Ser Ile Val Gly Pro Thr 405 410 415 Cys Asp Ser Asp Gly Lys Leu Asp Lys Ile Ile Lys Asp Asp Gly Ile 420 425 430 Ala Glu Asp Arg Leu Leu Pro Glu Leu Lys Pro Val Gly Asp Trp Leu 435 440 445 Ala Phe Pro Asp Thr Gly Ala Tyr Thr Tyr Ala Met Ala Ser Asn Tyr 450 455 460 Asn Gly Phe 465 14 19 PRT Artificial Sequence binding site 14 Xaa Xaa Xaa Lys Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 1 5 10 15 Xaa Xaa Xaa 15 14 PRT Artificial Sequence signature sequence 15 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Gly Gly Gly Xaa Xaa 1 5 10 16 1902 DNA Homo sapiens CDS (193)...(1401) 16 cgacccacgc gtccgcgtcg gagctcctgc agaccagtgc gcgctcgggg agttggcgag 60 cgggtggcgg ctgggagacg tcccgagcgc acgggactga caggcggcag aagccgggcg 120 gggtccgctg ggctccggac ccgtgcccac ccagttccag ggcggccccg ggcggccccg 180 ccccctcggt ga atg ccg cgg gcc ggc caa tcc ggg cag gcc gcg gcg ccg 231 Met Pro Arg Ala Gly Gln Ser Gly Gln Ala Ala Ala Pro 1 5 10 cgc agc cta tca gcg gcc aga gct cgc gtg cgc ttc cgc gtt cgc gtg 279 Arg Ser Leu Ser Ala Ala Arg Ala Arg Val Arg Phe Arg Val Arg Val 15 20 25 cgc ttc cgc gtt ctc gtg agc tcc cgg ccc gct gcc gca ggg act ggg 327 Arg Phe Arg Val Leu Val Ser Ser Arg Pro Ala Ala Ala Gly Thr Gly 30 35 40 45 agc 
ggt ctc cgc agg gac tgg gag cgg gct ccg cag cgc act cta gcc 375 Ser Gly Leu Arg Arg Asp Trp Glu Arg Ala Pro Gln Arg Thr Leu Ala 50 55 60 cgc ggc tcg gct cag tcg gtc tgc gag gat ccg gcc cgc cgc ccc ccg 423 Arg Gly Ser Ala Gln Ser Val Cys Glu Asp Pro Ala Arg Arg Pro Pro 65 70 75 ggg gac ccg atg gcc tcg gag ggc ctg gcg ggg gcg ctg gct tcc gtg 471 Gly Asp Pro Met Ala Ser Glu Gly Leu Ala Gly Ala Leu Ala Ser Val 80 85 90 ctg gct ggc cag ggg tcc agc gtg cac agc tgc gac tcg gcg ccg gcc 519 Leu Ala Gly Gln Gly Ser Ser Val His Ser Cys Asp Ser Ala Pro Ala 95 100 105 ggg gag ccg ccg gcg ccc gtg cgg ctg cgg aag aac gtg tgc tac gtg 567 Gly Glu Pro Pro Ala Pro Val Arg Leu Arg Lys Asn Val Cys Tyr Val 110 115 120 125 gtg ctg gcc gtg ttc ctc agc gag cag gat gag gtg cta ctg atc cag 615 Val Leu Ala Val Phe Leu Ser Glu Gln Asp Glu Val Leu Leu Ile Gln 130 135 140 gag gcc aag agg gag tgc cgg ggg tcg tgg tac ctg cct gcg ggg aga 663 Glu Ala Lys Arg Glu Cys Arg Gly Ser Trp Tyr Leu Pro Ala Gly Arg 145 150 155 atg gag cca ggg gag acc atc gtg gag gcg ctg cag cgg gag gtg aag 711 Met Glu Pro Gly Glu Thr Ile Val Glu Ala Leu Gln Arg Glu Val Lys 160 165 170 gag gag gcg ggg ctg cac tgt gag ccc gag aca ctg ctg tcc gtg gag 759 Glu Glu Ala Gly Leu His Cys Glu Pro Glu Thr Leu Leu Ser Val Glu 175 180 185 gag cgg ggc ccc tcc tgg gtc cgc ttc gtg ttc ctc gct cgc ccc aca 807 Glu Arg Gly Pro Ser Trp Val Arg Phe Val Phe Leu Ala Arg Pro Thr 190 195 200 205 ggt gga att ctc aag act tcc aag gag gcc gat gcg gag tcc ctg cag 855 Gly Gly Ile Leu Lys Thr Ser Lys Glu Ala Asp Ala Glu Ser Leu Gln 210 215 220 gct gcc tgg tac cca cgg acc tcc ctg ccc act ccg ctg cga gcc cat 903 Ala Ala Trp Tyr Pro Arg Thr Ser Leu Pro Thr Pro Leu Arg Ala His 225 230 235 gac atc ctg cac ctg gtt gaa cta gcc gcc cag tat cgc cag caa gcc 951 Asp Ile Leu His Leu Val Glu Leu Ala Ala Gln Tyr Arg Gln Gln Ala 240 245 250 agg cac cct ctc att ctg ccc caa gag cta ccc tgt gat ctg gtc tgc 999 Arg His Pro Leu Ile Leu Pro Gln Glu Leu Pro Cys Asp Leu Val Cys 
255 260 265 cag cgg ctc gtg gct acc ttt acc agc gcc cag aca gtg tgg gtg tta 1047 Gln Arg Leu Val Ala Thr Phe Thr Ser Ala Gln Thr Val Trp Val Leu 270 275 280 285 gtg ggc aca gtg ggg atg cct cac ttg cct gtc act gcc tgt ggc ctc 1095 Val Gly Thr Val Gly Met Pro His Leu Pro Val Thr Ala Cys Gly Leu 290 295 300 gac cct atg gag cag agg ggt ggc atg aag atg gcc gtc ctg cgg ctg 1143 Asp Pro Met Glu Gln Arg Gly Gly Met Lys Met Ala Val Leu Arg Leu 305 310 315 ctg cag gag tgt ctg acc ctg cac cac ttg gtg gtg gag atc aag ggg 1191 Leu Gln Glu Cys Leu Thr Leu His His Leu Val Val Glu Ile Lys Gly 320 325 330 ttg ctt gga ctg cag cac ctg ggc cga gat cac agt gat ggc atc tgt 1239 Leu Leu Gly Leu Gln His Leu Gly Arg Asp His Ser Asp Gly Ile Cys 335 340 345 ttg aat gtg ctg gtg acc gtg gct ttt cgg agc cca ggg atc cag gat 1287 Leu Asn Val Leu Val Thr Val Ala Phe Arg Ser Pro Gly Ile Gln Asp 350 355 360 365 gaa ccc cca aaa gtt cgg ggt gag aac ttc tct tgg tgg aag gtg atg 1335 Glu Pro Pro Lys Val Arg Gly Glu Asn Phe Ser Trp Trp Lys Val Met 370 375 380 gag gaa gac ctg caa agc cag ctc ctc cag cgg ctt cag gga tcc tct 1383 Glu Glu Asp Leu Gln Ser Gln Leu Leu Gln Arg Leu Gln Gly Ser Ser 385 390 395 gtt gtc cca gtg aac aga tagagaggtg gaggaggtga cagggagcta 1431 Val Val Pro Val Asn Arg 400 ggcagccgtg ctccctccag tgcggacttg tctccctctg agggaggcaa gaggctggcg 1491 atcagggatc ttgttgcatt gggagcaggg gcggctctcc tggtccccag gagagatgct 1551 ttgaggagca ttcctctaga ttgcacaagg gacagtgcct ttaaccaagc gaggagtcca 1611 aagctcagga cctgactacc ctgagggcac gctgacgcct ctccccaggg ggatggggag 1671 ctttctgcac ccccagtggc atctcctcat cacgttctgt gccgtccttg ggaaaggcct 1731 gcattctgat ccttccaggc ccttcgagca tggaggggca ctggggaagg tcccccgagg 1791 gaggagcacg ttgctgagta aagaggtgtt actcammata aaaaaaaaaa aaaaaaaaaa 1851 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa agggcggccg ctagactagt c 1902 17 403 PRT Homo sapiens 17 Met Pro Arg Ala Gly Gln Ser Gly Gln Ala Ala Ala Pro Arg Ser Leu 1 5 10 15 Ser Ala Ala Arg Ala Arg Val Arg Phe Arg Val Arg Val Arg Phe Arg 20 25 
30 Val Leu Val Ser Ser Arg Pro Ala Ala Ala Gly Thr Gly Ser Gly Leu 35 40 45 Arg Arg Asp Trp Glu Arg Ala Pro Gln Arg Thr Leu Ala Arg Gly Ser 50 55 60 Ala Gln Ser Val Cys Glu Asp Pro Ala Arg Arg Pro Pro Gly Asp Pro 65 70 75 80 Met Ala Ser Glu Gly Leu Ala Gly Ala Leu Ala Ser Val Leu Ala Gly 85 90 95 Gln Gly Ser Ser Val His Ser Cys Asp Ser Ala Pro Ala Gly Glu Pro 100 105 110 Pro Ala Pro Val Arg Leu Arg Lys Asn Val Cys Tyr Val Val Leu Ala 115 120 125 Val Phe Leu Ser Glu Gln Asp Glu Val Leu Leu Ile Gln Glu Ala Lys 130 135 140 Arg Glu Cys Arg Gly Ser Trp Tyr Leu Pro Ala Gly Arg Met Glu Pro 145 150 155 160 Gly Glu Thr Ile Val Glu Ala Leu Gln Arg Glu Val Lys Glu Glu Ala 165 170 175 Gly Leu His Cys Glu Pro Glu Thr Leu Leu Ser Val Glu Glu Arg Gly 180 185 190 Pro Ser Trp Val Arg Phe Val Phe Leu Ala Arg Pro Thr Gly Gly Ile 195 200 205 Leu Lys Thr Ser Lys Glu Ala Asp Ala Glu Ser Leu Gln Ala Ala Trp 210 215 220 Tyr Pro Arg Thr Ser Leu Pro Thr Pro Leu Arg Ala His Asp Ile Leu 225 230 235 240 His Leu Val Glu Leu Ala Ala Gln Tyr Arg Gln Gln Ala Arg His Pro 245 250 255 Leu Ile Leu Pro Gln Glu Leu Pro Cys Asp Leu Val Cys Gln Arg Leu 260 265 270 Val Ala Thr Phe Thr Ser Ala Gln Thr Val Trp Val Leu Val Gly Thr 275 280 285 Val Gly Met Pro His Leu Pro Val Thr Ala Cys Gly Leu Asp Pro Met 290 295 300 Glu Gln Arg Gly Gly Met Lys Met Ala Val Leu Arg Leu Leu Gln Glu 305 310 315 320 Cys Leu Thr Leu His His Leu Val Val Glu Ile Lys Gly Leu Leu Gly 325 330 335 Leu Gln His Leu Gly Arg Asp His Ser Asp Gly Ile Cys Leu Asn Val 340 345 350 Leu Val Thr Val Ala Phe Arg Ser Pro Gly Ile Gln Asp Glu Pro Pro 355 360 365 Lys Val Arg Gly Glu Asn Phe Ser Trp Trp Lys Val Met Glu Glu Asp 370 375 380 Leu Gln Ser Gln Leu Leu Gln Arg Leu Gln Gly Ser Ser Val Val Pro 385 390 395 400 Val Asn Arg 18 1212 DNA Homo sapiens 18 atgccgcggg ccggccaatc cgggcaggcc gcggcgccgc gcagcctatc agcggccaga 60 gctcgcgtgc gcttccgcgt tcgcgtgcgc ttccgcgttc tcgtgagctc ccggcccgct 120 gccgcaggga ctgggagcgg tctccgcagg gactgggagc gggctccgca 
gcgcactcta 180 gcccgcggct cggctcagtc ggtctgcgag gatccggccc gccgcccccc gggggacccg 240 atggcctcgg agggcctggc gggggcgctg gcttccgtgc tggctggcca ggggtccagc 300 gtgcacagct gcgactcggc gccggccggg gagccgccgg cgcccgtgcg gctgcggaag 360 aacgtgtgct acgtggtgct ggccgtgttc ctcagcgagc aggatgaggt gctactgatc 420 caggaggcca agagggagtg ccgggggtcg tggtacctgc ctgcggggag aatggagcca 480 ggggagacca tcgtggaggc gctgcagcgg gaggtgaagg aggaggcggg gctgcactgt 540 gagcccgaga cactgctgtc cgtggaggag cggggcccct cctgggtccg cttcgtgttc 600 ctcgctcgcc ccacaggtgg aattctcaag acttccaagg aggccgatgc ggagtccctg 660 caggctgcct ggtacccacg gacctccctg cccactccgc tgcgagccca tgacatcctg 720 cacctggttg aactagccgc ccagtatcgc cagcaagcca ggcaccctct cattctgccc 780 caagagctac cctgtgatct ggtctgccag cggctcgtgg ctacctttac cagcgcccag 840 acagtgtggg tgttagtggg cacagtgggg atgcctcact tgcctgtcac tgcctgtggc 900 ctcgacccta tggagcagag gggtggcatg aagatggccg tcctgcggct gctgcaggag 960 tgtctgaccc tgcaccactt ggtggtggag atcaaggggt tgcttggact gcagcacctg 1020 ggccgagatc acagtgatgg catctgtttg aatgtgctgg tgaccgtggc ttttcggagc 1080 ccagggatcc aggatgaacc cccaaaagtt cggggtgaga acttctcttg gtggaaggtg 1140 atggaggaag acctgcaaag ccagctcctc cagcggcttc agggatcctc tgttgtccca 1200 gtgaacagat ag 1212 19 134 PRT Artificial Sequence consensus sequence 19 Arg Arg Leu Ala Val Gly Val Val Leu Phe Asn Glu Asp Gly Glu Val 1 5 10 15 Leu Leu Val Arg Arg Ser Arg Pro Pro Pro Gly Leu Trp Glu Phe Pro 20 25 30 Gly Gly Lys Val Glu Pro Gly Glu Thr Pro Glu Glu Ala Ala Val Arg 35 40 45 Glu Leu Lys Glu Glu Thr Gly Ile Asp Val Ser Asp Ser Ala Glu Glu 50 55 60 Leu Leu Leu Leu Leu Gly Val Val Glu Tyr Pro Ala Pro Gly Arg Asp 65 70 75 80 Lys Val His Tyr Phe Leu Ala Glu Val Leu Gly Gly Glu Leu Pro Gln 85 90 95 Leu Pro Gly Thr Glu Val Ala Glu Val Arg Trp Val Ser Leu Glu Glu 100 105 110 Leu Pro Leu Leu Leu Leu Ala Gly Ser Ile Arg Asp Ala Lys Leu Ile 115 120 125 Ala Asp Leu Leu Ala Leu 130 20 30 PRT Artificial Sequence consensus sequence 20 Gly Xaa Xaa Xaa Xaa Xaa Glu Xaa Xaa Xaa Xaa 
Xaa Xaa Xaa Arg Glu 1 5 10 15 Xaa Xaa Glu Glu Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 20 25 30 21 16 PRT Artificial Sequence exemplary motif 21 Glu Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Arg Glu Leu Xaa Glu Glu 1 5 10 15 22 2798 DNA Homo sapiens CDS (108)...(2621) 22 tactataggg agtcgcccac gcgtccgggc agcggttgtg aggagttagc tcgcggcatt 60 gcaggctctg agaggagggg acccggttcc cgggtgagtg tccaggc atg cca gcg 116 Met Pro Ala 1 gaa cgg ccc gcg ggc agc ggc ggc tcg gag gct cca gca atg gtt gaa 164 Glu Arg Pro Ala Gly Ser Gly Gly Ser Glu Ala Pro Ala Met Val Glu 5 10 15 caa ctg gac act gct gtg att acc ccg gcc atg cta gaa gag gaa gaa 212 Gln Leu Asp Thr Ala Val Ile Thr Pro Ala Met Leu Glu Glu Glu Glu 20 25 30 35 cag ctt gaa gct gct gga cta gag aga gag cgg aag atg ctg gaa aag 260 Gln Leu Glu Ala Ala Gly Leu Glu Arg Glu Arg Lys Met Leu Glu Lys 40 45 50 gct cgc atg tct tgg gat aga gag tcg aca gaa att cgg tac cgt aga 308 Ala Arg Met Ser Trp Asp Arg Glu Ser Thr Glu Ile Arg Tyr Arg Arg 55 60 65 ctt caa cat ttg ctt gaa aaa agc aat ata tac tcc aaa ttt tta ttg 356 Leu Gln His Leu Leu Glu Lys Ser Asn Ile Tyr Ser Lys Phe Leu Leu 70 75 80 acg aaa atg gaa cag caa caa tta gag gaa cag aag aag aaa gaa aaa 404 Thr Lys Met Glu Gln Gln Gln Leu Glu Glu Gln Lys Lys Lys Glu Lys 85 90 95 ttg gag aga aaa aag gag tct tta aaa gtt aaa aag ggt aaa aat tca 452 Leu Glu Arg Lys Lys Glu Ser Leu Lys Val Lys Lys Gly Lys Asn Ser 100 105 110 115 att gat gca agt gaa gag aag cca gtt atg agg aaa aaa aga gga aga 500 Ile Asp Ala Ser Glu Glu Lys Pro Val Met Arg Lys Lys Arg Gly Arg 120 125 130 gaa gat gaa tca tac aat att tca gag gtc atg tca aaa gag gaa att 548 Glu Asp Glu Ser Tyr Asn Ile Ser Glu Val Met Ser Lys Glu Glu Ile 135 140 145 ttg tct gtg gct aaa aaa aat aaa aag gag aat gag gat gaa aac tcc 596 Leu Ser Val Ala Lys Lys Asn Lys Lys Glu Asn Glu Asp Glu Asn Ser 150 155 160 tcc tct act aat ctc tgt gtg gaa gat ctt cag aaa aat aaa gat tcg 644 Ser Ser Thr Asn Leu Cys Val Glu Asp Leu Gln Lys Asn Lys Asp Ser 165 170 175 aat agt ata 
att aaa gat aga ttg tct gaa acg gtt agg cag aat act 692 Asn Ser Ile Ile Lys Asp Arg Leu Ser Glu Thr Val Arg Gln Asn Thr 180 185 190 195 aaa ttc ttt ttt gac cca gtc cgg aag tgt aat ggt cag cca gta cct 740 Lys Phe Phe Phe Asp Pro Val Arg Lys Cys Asn Gly Gln Pro Val Pro 200 205 210 ttt caa caa cca aag cac ttc act gga gga gtg atg cga tgg tac caa 788 Phe Gln Gln Pro Lys His Phe Thr Gly Gly Val Met Arg Trp Tyr Gln 215 220 225 gta gaa ggc atg gaa tgg ctt agg atg ctt tgg gaa aat gga att aat 836 Val Glu Gly Met Glu Trp Leu Arg Met Leu Trp Glu Asn Gly Ile Asn 230 235 240 ggc att tta gca gat gaa atg gga ttg ggt aag aca gtt cag tgc att 884 Gly Ile Leu Ala Asp Glu Met Gly Leu Gly Lys Thr Val Gln Cys Ile 245 250 255 gct act att gca ttg atg att cag aga gga gta cca gga cct ttt ctt 932 Ala Thr Ile Ala Leu Met Ile Gln Arg Gly Val Pro Gly Pro Phe Leu 260 265 270 275 gtc tgt ggc cct ttg tct aca ctt cct aac tgg atg gct gaa ttc aaa 980 Val Cys Gly Pro Leu Ser Thr Leu Pro Asn Trp Met Ala Glu Phe Lys 280 285 290 aga ttt aca cca gat atc cct aca atg tta tat cat gga acc cag gag 1028 Arg Phe Thr Pro Asp Ile Pro Thr Met Leu Tyr His Gly Thr Gln Glu 295 300 305 gac cgt cga aaa ttg gta aga aat att tac aaa aga caa ggg aca ctg 1076 Asp Arg Arg Lys Leu Val Arg Asn Ile Tyr Lys Arg Gln Gly Thr Leu 310 315 320 cag att cat cct gtg gtg gtc aca tca ttc gag atc gct atg cga gac 1124 Gln Ile His Pro Val Val Val Thr Ser Phe Glu Ile Ala Met Arg Asp 325 330 335 cag aat gct tta cag cat tgc tat tgg aaa tac tta ata gta gat gaa 1172 Gln Asn Ala Leu Gln His Cys Tyr Trp Lys Tyr Leu Ile Val Asp Glu 340 345 350 355 gga cac agg att aag aat atg aag tgc cgt cta atc agg gag tta aaa 1220 Gly His Arg Ile Lys Asn Met Lys Cys Arg Leu Ile Arg Glu Leu Lys 360 365 370 cga ttc aat gct gat aac aaa ctt ctt ttg act ggt act ccc ttg caa 1268 Arg Phe Asn Ala Asp Asn Lys Leu Leu Leu Thr Gly Thr Pro Leu Gln 375 380 385 aac aat tta tca gaa ctt tgg tca ttg cta aac ttt ttg ttg cca gat 1316 Asn Asn Leu Ser Glu Leu Trp Ser Leu Leu Asn Phe Leu 
Leu Pro Asp 390 395 400 gta ttt gat gac ttg aaa agc ttt gag tct tgg ttt gac atc act agt 1364 Val Phe Asp Asp Leu Lys Ser Phe Glu Ser Trp Phe Asp Ile Thr Ser 405 410 415 ctt tct gaa act gct gaa gat att att gct aaa gaa aga gaa cag aat 1412 Leu Ser Glu Thr Ala Glu Asp Ile Ile Ala Lys Glu Arg Glu Gln Asn 420 425 430 435 gta ttg cat atg ctg cac cag att tta aca cct ttc tta ttg aga aga 1460 Val Leu His Met Leu His Gln Ile Leu Thr Pro Phe Leu Leu Arg Arg 440 445 450 ctg aag tct gat gtt gct ctt gaa gtt cct cct aaa cga gaa gta gtc 1508 Leu Lys Ser Asp Val Ala Leu Glu Val Pro Pro Lys Arg Glu Val Val 455 460 465 gtt tat gct cca ctt tca aag aag cag gag atc ttt tat aca gcc att 1556 Val Tyr Ala Pro Leu Ser Lys Lys Gln Glu Ile Phe Tyr Thr Ala Ile 470 475 480 gtg aac cgt aca att gca aac atg ttt gga tcc agt gag aaa gaa aca 1604 Val Asn Arg Thr Ile Ala Asn Met Phe Gly Ser Ser Glu Lys Glu Thr 485 490 495 att gag tta agt cct act ggt cga cca aaa cga cga act aga aaa tca 1652 Ile Glu Leu Ser Pro Thr Gly Arg Pro Lys Arg Arg Thr Arg Lys Ser 500 505 510 515 ata aat tac agc aaa ata gat gat ttc cct aat gaa ttg gaa aaa ctg 1700 Ile Asn Tyr Ser Lys Ile Asp Asp Phe Pro Asn Glu Leu Glu Lys Leu 520 525 530 atc agt caa ata cag cca gag gtg gac cga gaa aga gct gtt gtg gaa 1748 Ile Ser Gln Ile Gln Pro Glu Val Asp Arg Glu Arg Ala Val Val Glu 535 540 545 gtg aat atc cct gta gaa tct gaa gtt aat ctg aag ctg cag aat ata 1796 Val Asn Ile Pro Val Glu Ser Glu Val Asn Leu Lys Leu Gln Asn Ile 550 555 560 atg atg cta ctt cgt aaa tgt tgt aat cat cca tat ttg att gaa tat 1844 Met Met Leu Leu Arg Lys Cys Cys Asn His Pro Tyr Leu Ile Glu Tyr 565 570 575 cct ata gac cct gtt aca caa gaa ttt aag atc gat gaa gaa ttg gta 1892 Pro Ile Asp Pro Val Thr Gln Glu Phe Lys Ile Asp Glu Glu Leu Val 580 585 590 595 aca aat tct ggg aag ttc ttg att ttg gat cga atg ctg cca gaa cta 1940 Thr Asn Ser Gly Lys Phe Leu Ile Leu Asp Arg Met Leu Pro Glu Leu 600 605 610 aaa aaa aga ggt cac aag gtg ctg ctt ttt tca caa atg aca agc atg 1988 Lys Lys 
Arg Gly His Lys Val Leu Leu Phe Ser Gln Met Thr Ser Met 615 620 625 ttg gac att ttg atg gat tac tgc cat ctc aga gat ttc aac ttc agc 2036 Leu Asp Ile Leu Met Asp Tyr Cys His Leu Arg Asp Phe Asn Phe Ser 630 635 640 agg ctt gat ggg tcc atg tct tac tca gag aga gaa aaa aac atg cac 2084 Arg Leu Asp Gly Ser Met Ser Tyr Ser Glu Arg Glu Lys Asn Met His 645 650 655 agc ttc aac acg gat cca gag gtg ttt atc ttc tta gtg agt aca cga 2132 Ser Phe Asn Thr Asp Pro Glu Val Phe Ile Phe Leu Val Ser Thr Arg 660 665 670 675 gct ggt ggc ctg ggc att aat ctg act gca gca gat aca gtt atc att 2180 Ala Gly Gly Leu Gly Ile Asn Leu Thr Ala Ala Asp Thr Val Ile Ile 680 685 690 tat gat agt gat tgg aac ccc cag tcg gat ctt cag gcc cag gat aga 2228 Tyr Asp Ser Asp Trp Asn Pro Gln Ser Asp Leu Gln Ala Gln Asp Arg 695 700 705 tgt cat aga att ggt cag aca aag cca gtt gtt gtt tat cgc ctt gtt 2276 Cys His Arg Ile Gly Gln Thr Lys Pro Val Val Val Tyr Arg Leu Val 710 715 720 aca gca aat act atc gat cag aaa att gtg gaa aga gca gct gct aaa 2324 Thr Ala Asn Thr Ile Asp Gln Lys Ile Val Glu Arg Ala Ala Ala Lys 725 730 735 agg aaa ctg gaa aag ttg atc atc cat aaa aat cat ttc aaa ggt ggt 2372 Arg Lys Leu Glu Lys Leu Ile Ile His Lys Asn His Phe Lys Gly Gly 740 745 750 755 cag tct gga tta aat ctg tct aag aat ttc tta gat cct aag gaa tta 2420 Gln Ser Gly Leu Asn Leu Ser Lys Asn Phe Leu Asp Pro Lys Glu Leu 760 765 770 atg gaa tta tta aaa tct aga gat tat gaa agg gaa ata aaa gga tca 2468 Met Glu Leu Leu Lys Ser Arg Asp Tyr Glu Arg Glu Ile Lys Gly Ser 775 780 785 aga gag aag gtc att agt gat aaa gat cta gag ttg ttg tta gat cga 2516 Arg Glu Lys Val Ile Ser Asp Lys Asp Leu Glu Leu Leu Leu Asp Arg 790 795 800 agt gat ctt att gat caa atg aat gct tca gga cca att aaa gag aag 2564 Ser Asp Leu Ile Asp Gln Met Asn Ala Ser Gly Pro Ile Lys Glu Lys 805 810 815 atg ggg ata ttc aag ata tta gaa aat tct gaa gat tcc agt cct gaa 2612 Met Gly Ile Phe Lys Ile Leu Glu Asn Ser Glu Asp Ser Ser Pro Glu 820 825 830 835 tgt ttg ttt taaagtggag ctcaagaata 
gcttttaaaa gttcttattt 2661 Cys Leu Phe acatctagtg atttccctgt attgggtttg aaatactgat tgtccacttc acctttttta 2721 ttatatcagt tgacatgtaa ctagtaccat gccgtacctt aaatagatgg taattttctg 2781 agcctttccc aagaaca 2798 23 838 PRT Homo sapiens 23 Met Pro Ala Glu Arg Pro Ala Gly Ser Gly Gly Ser Glu Ala Pro Ala 1 5 10 15 Met Val Glu Gln Leu Asp Thr Ala Val Ile Thr Pro Ala Met Leu Glu 20 25 30 Glu Glu Glu Gln Leu Glu Ala Ala Gly Leu Glu Arg Glu Arg Lys Met 35 40 45 Leu Glu Lys Ala Arg Met Ser Trp Asp Arg Glu Ser Thr Glu Ile Arg 50 55 60 Tyr Arg Arg Leu Gln His Leu Leu Glu Lys Ser Asn Ile Tyr Ser Lys 65 70 75 80 Phe Leu Leu Thr Lys Met Glu Gln Gln Gln Leu Glu Glu Gln Lys Lys 85 90 95 Lys Glu Lys Leu Glu Arg Lys Lys Glu Ser Leu Lys Val Lys Lys Gly 100 105 110 Lys Asn Ser Ile Asp Ala Ser Glu Glu Lys Pro Val Met Arg Lys Lys 115 120 125 Arg Gly Arg Glu Asp Glu Ser Tyr Asn Ile Ser Glu Val Met Ser Lys 130 135 140 Glu Glu Ile Leu Ser Val Ala Lys Lys Asn Lys Lys Glu Asn Glu Asp 145 150 155 160 Glu Asn Ser Ser Ser Thr Asn Leu Cys Val Glu Asp Leu Gln Lys Asn 165 170 175 Lys Asp Ser Asn Ser Ile Ile Lys Asp Arg Leu Ser Glu Thr Val Arg 180 185 190 Gln Asn Thr Lys Phe Phe Phe Asp Pro Val Arg Lys Cys Asn Gly Gln 195 200 205 Pro Val Pro Phe Gln Gln Pro Lys His Phe Thr Gly Gly Val Met Arg 210 215 220 Trp Tyr Gln Val Glu Gly Met Glu Trp Leu Arg Met Leu Trp Glu Asn 225 230 235 240 Gly Ile Asn Gly Ile Leu Ala Asp Glu Met Gly Leu Gly Lys Thr Val 245 250 255 Gln Cys Ile Ala Thr Ile Ala Leu Met Ile Gln Arg Gly Val Pro Gly 260 265 270 Pro Phe Leu Val Cys Gly Pro Leu Ser Thr Leu Pro Asn Trp Met Ala 275 280 285 Glu Phe Lys Arg Phe Thr Pro Asp Ile Pro Thr Met Leu Tyr His Gly 290 295 300 Thr Gln Glu Asp Arg Arg Lys Leu Val Arg Asn Ile Tyr Lys Arg Gln 305 310 315 320 Gly Thr Leu Gln Ile His Pro Val Val Val Thr Ser Phe Glu Ile Ala 325 330 335 Met Arg Asp Gln Asn Ala Leu Gln His Cys Tyr Trp Lys Tyr Leu Ile 340 345 350 Val Asp Glu Gly His Arg Ile Lys Asn Met Lys Cys Arg Leu Ile Arg 355 360 365 Glu Leu Lys Arg Phe 
Asn Ala Asp Asn Lys Leu Leu Leu Thr Gly Thr 370 375 380 Pro Leu Gln Asn Asn Leu Ser Glu Leu Trp Ser Leu Leu Asn Phe Leu 385 390 395 400 Leu Pro Asp Val Phe Asp Asp Leu Lys Ser Phe Glu Ser Trp Phe Asp 405 410 415 Ile Thr Ser Leu Ser Glu Thr Ala Glu Asp Ile Ile Ala Lys Glu Arg 420 425 430 Glu Gln Asn Val Leu His Met Leu His Gln Ile Leu Thr Pro Phe Leu 435 440 445 Leu Arg Arg Leu Lys Ser Asp Val Ala Leu Glu Val Pro Pro Lys Arg 450 455 460 Glu Val Val Val Tyr Ala Pro Leu Ser Lys Lys Gln Glu Ile Phe Tyr 465 470 475 480 Thr Ala Ile Val Asn Arg Thr Ile Ala Asn Met Phe Gly Ser Ser Glu 485 490 495 Lys Glu Thr Ile Glu Leu Ser Pro Thr Gly Arg Pro Lys Arg Arg Thr 500 505 510 Arg Lys Ser Ile Asn Tyr Ser Lys Ile Asp Asp Phe Pro Asn Glu Leu 515 520 525 Glu Lys Leu Ile Ser Gln Ile Gln Pro Glu Val Asp Arg Glu Arg Ala 530 535 540 Val Val Glu Val Asn Ile Pro Val Glu Ser Glu Val Asn Leu Lys Leu 545 550 555 560 Gln Asn Ile Met Met Leu Leu Arg Lys Cys Cys Asn His Pro Tyr Leu 565 570 575 Ile Glu Tyr Pro Ile Asp Pro Val Thr Gln Glu Phe Lys Ile Asp Glu 580 585 590 Glu Leu Val Thr Asn Ser Gly Lys Phe Leu Ile Leu Asp Arg Met Leu 595 600 605 Pro Glu Leu Lys Lys Arg Gly His Lys Val Leu Leu Phe Ser Gln Met 610 615 620 Thr Ser Met Leu Asp Ile Leu Met Asp Tyr Cys His Leu Arg Asp Phe 625 630 635 640 Asn Phe Ser Arg Leu Asp Gly Ser Met Ser Tyr Ser Glu Arg Glu Lys 645 650 655 Asn Met His Ser Phe Asn Thr Asp Pro Glu Val Phe Ile Phe Leu Val 660 665 670 Ser Thr Arg Ala Gly Gly Leu Gly Ile Asn Leu Thr Ala Ala Asp Thr 675 680 685 Val Ile Ile Tyr Asp Ser Asp Trp Asn Pro Gln Ser Asp Leu Gln Ala 690 695 700 Gln Asp Arg Cys His Arg Ile Gly Gln Thr Lys Pro Val Val Val Tyr 705 710 715 720 Arg Leu Val Thr Ala Asn Thr Ile Asp Gln Lys Ile Val Glu Arg Ala 725 730 735 Ala Ala Lys Arg Lys Leu Glu Lys Leu Ile Ile His Lys Asn His Phe 740 745 750 Lys Gly Gly Gln Ser Gly Leu Asn Leu Ser Lys Asn Phe Leu Asp Pro 755 760 765 Lys Glu Leu Met Glu Leu Leu Lys Ser Arg Asp Tyr Glu Arg Glu Ile 770 775 780 Lys Gly Ser Arg Glu Lys 
Val Ile Ser Asp Lys Asp Leu Glu Leu Leu 785 790 795 800 Leu Asp Arg Ser Asp Leu Ile Asp Gln Met Asn Ala Ser Gly Pro Ile 805 810 815 Lys Glu Lys Met Gly Ile Phe Lys Ile Leu Glu Asn Ser Glu Asp Ser 820 825 830 Ser Pro Glu Cys Leu Phe 835 24 2517 DNA Homo sapiens 24 atgccagcgg aacggcccgc gggcagcggc ggctcggagg ctccagcaat ggttgaacaa 60 ctggacactg ctgtgattac cccggccatg ctagaagagg aagaacagct tgaagctgct 120 ggactagaga gagagcggaa gatgctggaa aaggctcgca tgtcttggga tagagagtcg 180 acagaaattc ggtaccgtag acttcaacat ttgcttgaaa aaagcaatat mtactccaaa 240 tttttattga cgaaaatgga acagcaacaa ttagaggaac agaagaagaa agaaaaattg 300 gagagaaaaa aggagtcttt aaaagttaaa aagggtaaaa attcaattga tgcaagtgaa 360 gagaagccag ttatgaggaa aaaaagagga agagaagatg aatcatacaa tatttcagag 420 gtcatgtcaa aagaggaaat tttgtctgtg gctaaaaaaa ataaaaagga gaatgaggat 480 gaaaactcct cctctactaa tctctgtgtg gaagatcttc agaaaaataa agattcgaat 540 agtataatta aagatagatt gtctgaaacg gttaggcaga atactaaatt cttttttgac 600 ccagtccgga agtgtaatgg tcagccagta ccttttcaac aaccaaagca cttcactgga 660 ggagtgatgc gatggtacca agtagaaggc atggaatggc ttaggatgct ttgggaaaat 720 ggaattaatg gcattttagc agatgaaatg ggattgggta agacagttca gtgcattgct 780 actattgcat tgatgattca gagaggagta ccaggacctt ttcttgtctg tggccctttg 840 tctacacttc ctaactggat ggctgaattc aaaagattta caccagatat ccctacaatg 900 ttatatcatg gaacccagga ggaccgtcga aaattggtaa gaaatattta caaaagacaa 960 gggacactgc agattcatcc tgtggtggtc acatcattcg agatcgctat gcgagaccag 1020 aatgctttac agcattgcta ttggaaatac ttaatagtag atgaaggaca caggattaag 1080 aatatgaagt gccgtctaat cagggagtta aaacgattca atgctgataa caaacttctt 1140 ttgactggta ctcccttgca aaacaattta tcagaacttt ggtcattgct aaactttttg 1200 ttgccagatg tatttgatga cttgaaaagc tttgagtctt ggtttgacat cactagtctt 1260 tctgaaactg ctgaagatat tattgctaaa gaaagagaac agaatgtatt gcatatgctg 1320 caccagattt taacaccttt cttattgaga agactgaagt ctgatgttgc tcttgaagtt 1380 cctcctaaac gagaagtagt cgtttatgct ccactttcaa agaagcagga gatcttttat 1440 acagccattg tgaaccgtac aattgcaaac atgtttggat 
ccagtgagaa agaaacaatt 1500 gagttaagtc ctactggtcg accaaaacga cgaactagaa aatcaataaa ttacagcaaa 1560 atagatgatt tccctaatga attggaaaaa ctgatcagtc aaatacagcc agaggtggac 1620 cgagaaagag ctgttgtgga agtgaatatc cctgtagaat ctgaagttaa tctgaagctg 1680 cagaatataa tgatgctact tcgtaaatgt tgtaatcatc catatttgat tgaatatcct 1740 atagaccctg ttacacaaga atttaagatc gatgaagaat tggtaacaaa ttctgggaag 1800 ttcttgattt tggatcgaat gctgccagaa ctaaaaaaaa gaggtcacaa ggtgctgctt 1860 ttttcacaaa tgacaagcat gttggacatt ttgatggatt actgccatct cagagatttc 1920 aacttcagca ggcttgatgg gtccatgtct tactcagaga gagaaaaaaa catgcacagc 1980 ttcaacacgg atccagaggt gtttatcttc ttagtgagta cacgagctgg tggcctgggc 2040 attaatctga ctgcagcaga tacagttatc atttatgata gtgattggaa cccccagtcg 2100 gatcttcagg cccaggatag atgtcataga attggtcaga caaagccagt tgttgtttat 2160 cgccttgtta cagcaaatac tatcgatcag aaaattgtgg aaagagcagc tgctaaaagg 2220 aaactggaaa agttgatcat ccataaaaat catttcaaag gtggtcagtc tggattaaat 2280 ctgtctaaga atttcttaga tcctaaggaa ttaatggaat tattaaaatc tagagattat 2340 gaaagggaaa taaaaggatc aagagagaag gtcattagtg ataaagatct agagttgttg 2400 ttagatcgaa gtgatcttat tgatcaaatg aatgcttcag gaccaattaa agagaagatg 2460 gggatattca agatattaga aaattctgaa gattccagtc ctgaatgttt gttttaa 2517 25 354 PRT Artificial Sequence consensus sequence 25 Tyr Gln Leu Glu Gly Val Asn Trp Leu Ile Ser Leu Tyr Lys Asn Leu 1 5 10 15 Val Glu Asn Asp Ser Cys Gly Leu Gly Gly Ile Leu Ala Asp Glu Met 20 25 30 Gly Leu Gly Lys Thr Leu Gln Thr Ile Ser Leu Leu Ala Tyr Leu Leu 35 40 45 Glu Leu Lys Pro Lys Ala Glu Lys Arg Ile Gly Pro Phe Leu Val Val 50 55 60 Cys Pro Leu Ser Thr Leu Asp Asn Trp Leu Asn Glu Phe Glu Lys Trp 65 70 75 80 Ala Pro Asp Asp Leu Asn Ile Val Val Tyr Tyr Gly Asp Gly Asn Ser 85 90 95 Arg Asp Gln Ile Arg Lys Asp Glu Leu Leu Arg Asn Phe Leu Lys Asp 100 105 110 Gly Gly Arg Leu Lys Tyr Asp Val Leu Ile Thr Ser Tyr Glu Ile Ile 115 120 125 Arg Lys Asn Lys Leu Leu Lys Asp Lys Asp Glu Leu Lys Lys Leu Glu 130 135 140 Ile Asn Trp Asp Tyr Leu Ile Leu Asp Glu Gly 
His His Arg Leu Lys 145 150 155 160 Asn Glu Asp Ser Lys Leu Arg Lys Ser Lys Ala Leu Asn Lys Leu Lys 165 170 175 His Thr Arg Asn Arg Leu Leu Leu Thr Gly Thr Pro Leu Gln Asn Asn 180 185 190 Leu Lys Glu Leu Trp Ser Leu Leu Asn Phe Leu Met Pro Gly Ile Phe 195 200 205 Gly Ser Leu Glu Glu Tyr Ile Asn Asp Gly Lys Ser Phe Asp Lys Trp 210 215 220 Phe Asn Asn Pro Ile Leu Glu Gly Arg Asp Ser Ala Val Leu Ala Asp 225 230 235 240 Ala Ser Glu Ala Leu Arg Arg Lys Lys Asp Val Glu Glu Gly Leu Lys 245 250 255 Leu Leu Asn Arg Leu His Lys Ile Leu Lys Pro Phe Leu Leu Arg Arg 260 265 270 Leu Lys Lys Asp Val Glu Lys Ser Leu Pro Pro Pro Glu Lys Thr Glu 275 280 285 Tyr Val Leu Phe Cys Lys Leu Ser Lys Leu Gln Lys Glu Leu Tyr Lys 290 295 300 Lys Phe Leu Lys Gly Glu Ser Lys Asp Val Lys Ala Ile Asn Asp Ser 305 310 315 320 Glu Arg Arg Glu Gly Gly Lys Glu Lys Asn Glu Gly Lys Ser Arg Leu 325 330 335 Leu Asn Leu Ile Met Gln Leu Arg Lys Ile Cys Asn Phe His Pro Tyr 340 345 350 Leu Phe 26 91 PRT Artificial Sequence consensus sequence 26 Glu Glu Leu Ala Lys Phe Leu Lys Glu Leu Phe Lys Lys Asn Pro Gly 1 5 10 15 Ile Lys Val Ala Tyr Leu His Gly Ser Leu Ser Gln Lys Glu Arg Asp 20 25 30 Lys Ile Leu Glu Asp Phe Asn Asp Gly Glu Asn Thr Glu Ile Ile Val 35 40 45 Val Leu Val Ala Thr Asp Val Ala Gly Arg Gly Ile Asp Leu Pro Asp 50 55 60 Val Asn Leu Val Ile Asn Tyr Asp Leu Pro Trp Asn Pro Glu Gln Tyr 65 70 75 80 Val Gln Arg Ile Gly Arg Thr Gly Arg Ala Gly 85 90 27 3502 DNA Homo sapiens CDS (386)...(2833) 27 tccgacccac gcgtccgact agttctagat cgcgatctag aactagcggg gacacactat 60 tgacagcaga aacaatgaat ttcctccaaa cccggcaatg ttggtggctc ttgcattcct 120 ctggatgagc gaatctagtt ggggggttcc cgaaggggaa ggcgcctggg ctttcaatac 180 atcctcctga atcatactgc gtttcaggtt ccttagaaaa atttggatgt gtaaaaagaa 240 ctcttaacgg cgatgcaggt cttccacagc taaggttgca ttggagtttt cgaaagactt 300 atctttctgc aggctcgcct ctgagctttg tctccttgga gccacctcac ttagacagct 360 tcggatgtgg atgcagattt gaacc atg ttg cgt ccc cag gga ctg cta tgg 412 Met Leu Arg Pro Gln Gly Leu 
Leu Trp 1 5 ctc cct ttg ttg ttc acc tct gtc tgt gtc atg tta aac tcc aat gtt 460 Leu Pro Leu Leu Phe Thr Ser Val Cys Val Met Leu Asn Ser Asn Val 10 15 20 25 ctt ctg tgg ata act gct ctt gcc atc aag ttc acc ctc att gac agc 508 Leu Leu Trp Ile Thr Ala Leu Ala Ile Lys Phe Thr Leu Ile Asp Ser 30 35 40 caa gca cag tat cca gtt gtc aac aca aat tat ggt aaa atc cag ggc 556 Gln Ala Gln Tyr Pro Val Val Asn Thr Asn Tyr Gly Lys Ile Gln Gly 45 50 55 cta aga aca cca tta ccc agt gag atc ttg ggt cca gtg gag cag tac 604 Leu Arg Thr Pro Leu Pro Ser Glu Ile Leu Gly Pro Val Glu Gln Tyr 60 65 70 tta ggg gtc ccc tat gcc tca ccc cca act gga gag agg cgg ttt cag 652 Leu Gly Val Pro Tyr Ala Ser Pro Pro Thr Gly Glu Arg Arg Phe Gln 75 80 85 cca cca gaa tcc cca tcc tcc tgg act ggc atc cga aat gct act cag 700 Pro Pro Glu Ser Pro Ser Ser Trp Thr Gly Ile Arg Asn Ala Thr Gln 90 95 100 105 ttt tct gct gtg tgc ccc cag cac ctg gat gaa aga ttc tta ttg cat 748 Phe Ser Ala Val Cys Pro Gln His Leu Asp Glu Arg Phe Leu Leu His 110 115 120 gac atg ctg ccc atc tgg ttt acc acc agt ttg gat act ttg atg acc 796 Asp Met Leu Pro Ile Trp Phe Thr Thr Ser Leu Asp Thr Leu Met Thr 125 130 135 tat gtt caa gat caa aat gaa gac tgc ctt tac tta aac atc tat gtg 844 Tyr Val Gln Asp Gln Asn Glu Asp Cys Leu Tyr Leu Asn Ile Tyr Val 140 145 150 ccc atg gaa gat gat att cat gaa cag aac agt aag aag cct gtt atg 892 Pro Met Glu Asp Asp Ile His Glu Gln Asn Ser Lys Lys Pro Val Met 155 160 165 gtc tat atc cat ggg gga tct tac atg gag gga acc ggt aac atg att 940 Val Tyr Ile His Gly Gly Ser Tyr Met Glu Gly Thr Gly Asn Met Ile 170 175 180 185 gat ggc agc att ttg gcc agc tat ggg aac gtc atc gtt atc acc att 988 Asp Gly Ser Ile Leu Ala Ser Tyr Gly Asn Val Ile Val Ile Thr Ile 190 195 200 aac tac cgt ctg gga ata cta ggg ttt tta agt acc ggt gac cag gca 1036 Asn Tyr Arg Leu Gly Ile Leu Gly Phe Leu Ser Thr Gly Asp Gln Ala 205 210 215 gca aaa ggc aac tat ggg ctc ctg gat cag att caa gca ctg agg tgg 1084 Ala Lys Gly Asn Tyr Gly Leu Leu Asp Gln Ile Gln Ala 
Leu Arg Trp 220 225 230 att gag gag aat gtc gga gcc ttt ggc ggg gac ccc aag aga gtg act 1132 Ile Glu Glu Asn Val Gly Ala Phe Gly Gly Asp Pro Lys Arg Val Thr 235 240 245 atc ttt ggc tcg ggg gct ggg gcc tcc tgt gtc agc ctg ttg acc ctg 1180 Ile Phe Gly Ser Gly Ala Gly Ala Ser Cys Val Ser Leu Leu Thr Leu 250 255 260 265 tcc cac tac tca gaa ggt ctc ttc cag aag gcc atc att cag agc ggc 1228 Ser His Tyr Ser Glu Gly Leu Phe Gln Lys Ala Ile Ile Gln Ser Gly 270 275 280 act gcc ctg tcc agc tgg gca gtg aac tac cag ccg gcc aag tac act 1276 Thr Ala Leu Ser Ser Trp Ala Val Asn Tyr Gln Pro Ala Lys Tyr Thr 285 290 295 cgg ata ttg gca gac aag gtc ggc tgc aac atg ctg gac acc acg gac 1324 Arg Ile Leu Ala Asp Lys Val Gly Cys Asn Met Leu Asp Thr Thr Asp 300 305 310 atg gta gaa tgt ctg aag aac aag aac tac aag gag ctc atc cag cag 1372 Met Val Glu Cys Leu Lys Asn Lys Asn Tyr Lys Glu Leu Ile Gln Gln 315 320 325 acc atc acc ccg gcc acc tac cac ata gcc ttt ggg ccg gtg atc gac 1420 Thr Ile Thr Pro Ala Thr Tyr His Ile Ala Phe Gly Pro Val Ile Asp 330 335 340 345 ggc gac gtc atc cca gac gac ccc cag atc ctg atg gag caa ggc gag 1468 Gly Asp Val Ile Pro Asp Asp Pro Gln Ile Leu Met Glu Gln Gly Glu 350 355 360 ttc ctc aac tac gac atc atg ctg ggc gtc aac caa ggg gaa ggc ctg 1516 Phe Leu Asn Tyr Asp Ile Met Leu Gly Val Asn Gln Gly Glu Gly Leu 365 370 375 aag ttc gtg gac ggc atc gtg gat aac gag gac ggt gtg acg ccc aac 1564 Lys Phe Val Asp Gly Ile Val Asp Asn Glu Asp Gly Val Thr Pro Asn 380 385 390 gac ttt gac ttc tcc gtg tcc aac ttc gtg gac aac ctt tac ggc tac 1612 Asp Phe Asp Phe Ser Val Ser Asn Phe Val Asp Asn Leu Tyr Gly Tyr 395 400 405 cct gaa ggg aaa gac act ttg cgg gag act atc aag ttc atg tac aca 1660 Pro Glu Gly Lys Asp Thr Leu Arg Glu Thr Ile Lys Phe Met Tyr Thr 410 415 420 425 gac tgg gcc gat aag gaa aac ccg gag acg cgg cgg aaa acc ctg gtg 1708 Asp Trp Ala Asp Lys Glu Asn Pro Glu Thr Arg Arg Lys Thr Leu Val 430 435 440 gct ctc ttt act gac cat cag tgg gtg gcc ccc gcc gtg gcc acc gcc 1756 Ala Leu 
Phe Thr Asp His Gln Trp Val Ala Pro Ala Val Ala Thr Ala 445 450 455 gac ctg cac gcg cag tac ggc tcc ccc acc tac ttc tat gcc ttc tat 1804 Asp Leu His Ala Gln Tyr Gly Ser Pro Thr Tyr Phe Tyr Ala Phe Tyr 460 465 470 cat cac tgc caa agc gaa atg aag ccc agc tgg gca gat tcg gcc cat 1852 His His Cys Gln Ser Glu Met Lys Pro Ser Trp Ala Asp Ser Ala His 475 480 485 ggc gat gaa gtc ccc tat gtc ttc ggc atc ccc atg atc ggt ccc aca 1900 Gly Asp Glu Val Pro Tyr Val Phe Gly Ile Pro Met Ile Gly Pro Thr 490 495 500 505 gag ctc ttc agt tgt aat ttc tcc aag aac gac gtc atg ctc agt gcc 1948 Glu Leu Phe Ser Cys Asn Phe Ser Lys Asn Asp Val Met Leu Ser Ala 510 515 520 gtg gtg atg acc tac tgg acg aac ttc gcc aaa act ggt gat cca aac 1996 Val Val Met Thr Tyr Trp Thr Asn Phe Ala Lys Thr Gly Asp Pro Asn 525 530 535 caa cca gtt cct cag gat acc aag ttc att cat aca aaa ccc aat cgc 2044 Gln Pro Val Pro Gln Asp Thr Lys Phe Ile His Thr Lys Pro Asn Arg 540 545 550 ttt gaa gaa gtg gcc tgg tcc aag tat aat ccc aaa gac cag ctc tat 2092 Phe Glu Glu Val Ala Trp Ser Lys Tyr Asn Pro Lys Asp Gln Leu Tyr 555 560 565 ctg cat att ggc ttg aaa ccc aga gtg aga gat cac tac cgg gca acg 2140 Leu His Ile Gly Leu Lys Pro Arg Val Arg Asp His Tyr Arg Ala Thr 570 575 580 585 aaa gtg gct ttc tgg ttg gaa ttg gtt cct cat ttg cac aac ttg aac 2188 Lys Val Ala Phe Trp Leu Glu Leu Val Pro His Leu His Asn Leu Asn 590 595 600 gag ata ttc cag tat gtt tca aca acc aca aag gtt cct cca cca gac 2236 Glu Ile Phe Gln Tyr Val Ser Thr Thr Thr Lys Val Pro Pro Pro Asp 605 610 615 atg aca tca ttt ccc tat ggc acc cgg cga tct ccc gcc aag ata tgg 2284 Met Thr Ser Phe Pro Tyr Gly Thr Arg Arg Ser Pro Ala Lys Ile Trp 620 625 630 cca acc acc aaa cgc cca gca atc act cct gcc aac aat ccc aaa cac 2332 Pro Thr Thr Lys Arg Pro Ala Ile Thr Pro Ala Asn Asn Pro Lys His 635 640 645 tct aag gac cct cac aaa aca ggg ccc gag gac aca act gtc ctc att 2380 Ser Lys Asp Pro His Lys Thr Gly Pro Glu Asp Thr Thr Val Leu Ile 650 655 660 665 gaa acc aaa cga gat tat tcc acc 
gaa tta agt gtc acc att gcc gtc 2428 Glu Thr Lys Arg Asp Tyr Ser Thr Glu Leu Ser Val Thr Ile Ala Val 670 675 680 ggg gcg tcg ctc ctc ttc ctc aac atc tta gcc ttt gcg gcg ctg tac 2476 Gly Ala Ser Leu Leu Phe Leu Asn Ile Leu Ala Phe Ala Ala Leu Tyr 685 690 695 tac aaa aag gac aag agg cgc cat gag act cac agg cac ccc agt ccc 2524 Tyr Lys Lys Asp Lys Arg Arg His Glu Thr His Arg His Pro Ser Pro 700 705 710 cag aga aac acc aca aat gat atc act cac atc cag aac gaa gag atc 2572 Gln Arg Asn Thr Thr Asn Asp Ile Thr His Ile Gln Asn Glu Glu Ile 715 720 725 atg tct ctg cag atg aag cag ctg gaa cac gat cac gag tgt gag tcg 2620 Met Ser Leu Gln Met Lys Gln Leu Glu His Asp His Glu Cys Glu Ser 730 735 740 745 ctg cag gca cac gac acg ctg agg ctc acc tgc cct cca gac tac acc 2668 Leu Gln Ala His Asp Thr Leu Arg Leu Thr Cys Pro Pro Asp Tyr Thr 750 755 760 ctc acg ctg cgc cgg tcg ccg gat gac atc cca ttt atg acg cca aac 2716 Leu Thr Leu Arg Arg Ser Pro Asp Asp Ile Pro Phe Met Thr Pro Asn 765 770 775 acc atc acc atg att cca aac aca ttg atg ggg atg cag cct tta cac 2764 Thr Ile Thr Met Ile Pro Asn Thr Leu Met Gly Met Gln Pro Leu His 780 785 790 act ttt aaa acc ttc agt gga gga caa aac agt aca aat tta ccc cac 2812 Thr Phe Lys Thr Phe Ser Gly Gly Gln Asn Ser Thr Asn Leu Pro His 795 800 805 gga cat tcc acc act aga gta tagcttttcc ctatttcccc tcctatccct 2863 Gly His Ser Thr Thr Arg Val 810 815 ctgcccctac tgctcagcaa tgtaaaagag acaaataagg agaaagaaaa tctccaaacc 2923 aggaatgttt ttgtgccact gactttagat aaaaatgcaa aagggcagtc atcctgtccc 2983 agcagaccct tctcattggc attttccagt attgtgagat caatttctga ccatatgaaa 3043 tgtgaaaagt atatgtttct gttacaatac tgctttaaga tctaaaccat gccaacagat 3103 gtttcgtgtg actaggacat caccatttca aggaactgtg tgtttccaac atcatggtag 3163 cagcacacac ttccaaagct cagccaggga cacttaatat tttttaatta caatggaaat 3223 ttaaacattt ttatgtgggc tacacaatgg atggctcttc ttaagtgaag aaagactcta 3283 taggctttta cacagcacat gaagcagtaa tccagaaaga aggaaatgca gaattttatt 3343 atcaaagtaa gcgaattgac tgtgcagaaa aattgtaggg 
ttctgtggaa ggaggtattc 3403 tgccagcctg aactatattt aagaaacttt gtaaaaaata aaaatgtata tagctgtgag 3463 ctcaaacaaa aactgcaaaa aaaaaaaaaa aaaaaaaaa 3502 28 816 PRT Homo sapiens 28 Met Leu Arg Pro Gln Gly Leu Leu Trp Leu Pro Leu Leu Phe Thr Ser 1 5 10 15 Val Cys Val Met Leu Asn Ser Asn Val Leu Leu Trp Ile Thr Ala Leu 20 25 30 Ala Ile Lys Phe Thr Leu Ile Asp Ser Gln Ala Gln Tyr Pro Val Val 35 40 45 Asn Thr Asn Tyr Gly Lys Ile Gln Gly Leu Arg Thr Pro Leu Pro Ser 50 55 60 Glu Ile Leu Gly Pro Val Glu Gln Tyr Leu Gly Val Pro Tyr Ala Ser 65 70 75 80 Pro Pro Thr Gly Glu Arg Arg Phe Gln Pro Pro Glu Ser Pro Ser Ser 85 90 95 Trp Thr Gly Ile Arg Asn Ala Thr Gln Phe Ser Ala Val Cys Pro Gln 100 105 110 His Leu Asp Glu Arg Phe Leu Leu His Asp Met Leu Pro Ile Trp Phe 115 120 125 Thr Thr Ser Leu Asp Thr Leu Met Thr Tyr Val Gln Asp Gln Asn Glu 130 135 140 Asp Cys Leu Tyr Leu Asn Ile Tyr Val Pro Met Glu Asp Asp Ile His 145 150 155 160 Glu Gln Asn Ser Lys Lys Pro Val Met Val Tyr Ile His Gly Gly Ser 165 170 175 Tyr Met Glu Gly Thr Gly Asn Met Ile Asp Gly Ser Ile Leu Ala Ser 180 185 190 Tyr Gly Asn Val Ile Val Ile Thr Ile Asn Tyr Arg Leu Gly Ile Leu 195 200 205 Gly Phe Leu Ser Thr Gly Asp Gln Ala Ala Lys Gly Asn Tyr Gly Leu 210 215 220 Leu Asp Gln Ile Gln Ala Leu Arg Trp Ile Glu Glu Asn Val Gly Ala 225 230 235 240 Phe Gly Gly Asp Pro Lys Arg Val Thr Ile Phe Gly Ser Gly Ala Gly 245 250 255 Ala Ser Cys Val Ser Leu Leu Thr Leu Ser His Tyr Ser Glu Gly Leu 260 265 270 Phe Gln Lys Ala Ile Ile Gln Ser Gly Thr Ala Leu Ser Ser Trp Ala 275 280 285 Val Asn Tyr Gln Pro Ala Lys Tyr Thr Arg Ile Leu Ala Asp Lys Val 290 295 300 Gly Cys Asn Met Leu Asp Thr Thr Asp Met Val Glu Cys Leu Lys Asn 305 310 315 320 Lys Asn Tyr Lys Glu Leu Ile Gln Gln Thr Ile Thr Pro Ala Thr Tyr 325 330 335 His Ile Ala Phe Gly Pro Val Ile Asp Gly Asp Val Ile Pro Asp Asp 340 345 350 Pro Gln Ile Leu Met Glu Gln Gly Glu Phe Leu Asn Tyr Asp Ile Met 355 360 365 Leu Gly Val Asn Gln Gly Glu Gly Leu Lys Phe Val Asp Gly Ile Val 370 375 380 Asp 
Asn Glu Asp Gly Val Thr Pro Asn Asp Phe Asp Phe Ser Val Ser 385 390 395 400 Asn Phe Val Asp Asn Leu Tyr Gly Tyr Pro Glu Gly Lys Asp Thr Leu 405 410 415 Arg Glu Thr Ile Lys Phe Met Tyr Thr Asp Trp Ala Asp Lys Glu Asn 420 425 430 Pro Glu Thr Arg Arg Lys Thr Leu Val Ala Leu Phe Thr Asp His Gln 435 440 445 Trp Val Ala Pro Ala Val Ala Thr Ala Asp Leu His Ala Gln Tyr Gly 450 455 460 Ser Pro Thr Tyr Phe Tyr Ala Phe Tyr His His Cys Gln Ser Glu Met 465 470 475 480 Lys Pro Ser Trp Ala Asp Ser Ala His Gly Asp Glu Val Pro Tyr Val 485 490 495 Phe Gly Ile Pro Met Ile Gly Pro Thr Glu Leu Phe Ser Cys Asn Phe 500 505 510 Ser Lys Asn Asp Val Met Leu Ser Ala Val Val Met Thr Tyr Trp Thr 515 520 525 Asn Phe Ala Lys Thr Gly Asp Pro Asn Gln Pro Val Pro Gln Asp Thr 530 535 540 Lys Phe Ile His Thr Lys Pro Asn Arg Phe Glu Glu Val Ala Trp Ser 545 550 555 560 Lys Tyr Asn Pro Lys Asp Gln Leu Tyr Leu His Ile Gly Leu Lys Pro 565 570 575 Arg Val Arg Asp His Tyr Arg Ala Thr Lys Val Ala Phe Trp Leu Glu 580 585 590 Leu Val Pro His Leu His Asn Leu Asn Glu Ile Phe Gln Tyr Val Ser 595 600 605 Thr Thr Thr Lys Val Pro Pro Pro Asp Met Thr Ser Phe Pro Tyr Gly 610 615 620 Thr Arg Arg Ser Pro Ala Lys Ile Trp Pro Thr Thr Lys Arg Pro Ala 625 630 635 640 Ile Thr Pro Ala Asn Asn Pro Lys His Ser Lys Asp Pro His Lys Thr 645 650 655 Gly Pro Glu Asp Thr Thr Val Leu Ile Glu Thr Lys Arg Asp Tyr Ser 660 665 670 Thr Glu Leu Ser Val Thr Ile Ala Val Gly Ala Ser Leu Leu Phe Leu 675 680 685 Asn Ile Leu Ala Phe Ala Ala Leu Tyr Tyr Lys Lys Asp Lys Arg Arg 690 695 700 His Glu Thr His Arg His Pro Ser Pro Gln Arg Asn Thr Thr Asn Asp 705 710 715 720 Ile Thr His Ile Gln Asn Glu Glu Ile Met Ser Leu Gln Met Lys Gln 725 730 735 Leu Glu His Asp His Glu Cys Glu Ser Leu Gln Ala His Asp Thr Leu 740 745 750 Arg Leu Thr Cys Pro Pro Asp Tyr Thr Leu Thr Leu Arg Arg Ser Pro 755 760 765 Asp Asp Ile Pro Phe Met Thr Pro Asn Thr Ile Thr Met Ile Pro Asn 770 775 780 Thr Leu Met Gly Met Gln Pro Leu His Thr Phe Lys Thr Phe Ser Gly 785 790 795 800 Gly 
Gln Asn Ser Thr Asn Leu Pro His Gly His Ser Thr Thr Arg Val 805 810 815 29 2451 DNA Homo sapiens 29 atgttgcgtc cccagggact gctatggctc cctttgttgt tcacctctgt ctgtgtcatg 60 ttaaactcca atgttcttct gtggataact gctcttgcca tcaagttcac cctcattgac 120 agccaagcac agtatccagt tgtcaacaca aattatggta aaatccaggg cctaagaaca 180 ccattaccca gtgagatctt gggtccagtg gagcagtact taggggtccc ctatgcctca 240 cccccaactg gagagaggcg gtttcagcca ccagaatccc catcctcctg gactggcatc 300 cgaaatgcta ctcagttttc tgctgtgtgc ccccagcacc tggatgaaag attcttattg 360 catgacatgc tgcccatctg gtttaccacc agtttggata ctttgatgac ctatgttcaa 420 gatcaaaatg aagactgcct ttacttaaac atctatgtgc ccatggaaga tgatattcat 480 gaacagaaca gtaagaagcc tgttatggtc tatatccatg ggggatctta catggaggga 540 accggtaaca tgattgatgg cagcattttg gccagctatg ggaacgtcat cgttatcacc 600 attaactacc gtctgggaat actagggttt ttaagtaccg gtgaccaggc agcaaaaggc 660 aactatgggc tcctggatca gattcaagca ctgaggtgga ttgaggagaa tgtcggagcc 720 tttggcgggg accccaagag agtgactatc tttggctcgg gggctggggc ctcctgtgtc 780 agcctgttga ccctgtccca ctactcagaa ggtctcttcc agaaggccat cattcagagc 840 ggcactgccc tgtccagctg ggcagtgaac taccagccgg ccaagtacac tcggatattg 900 gcagacaagg tcggctgcaa catgctggac accacggaca tggtagaatg tctgaagaac 960 aagaactaca aggagctcat ccagcagacc atcaccccgg ccacctacca catagccttt 1020 gggccggtga tcgacggcga cgtcatccca gacgaccccc agatcctgat ggagcaaggc 1080 gagttcctca actacgacat catgctgggc gtcaaccaag gggaaggcct gaagttcgtg 1140 gacggcatcg tggataacga ggacggtgtg acgcccaacg actttgactt ctccgtgtcc 1200 aacttcgtgg acaaccttta cggctaccct gaagggaaag acactttgcg ggagactatc 1260 aagttcatgt acacagactg ggccgataag gaaaacccgg agacgcggcg gaaaaccctg 1320 gtggctctct ttactgacca tcagtgggtg gcccccgccg tggccaccgc cgacctgcac 1380 gcgcagtacg gctcccccac ctacttctat gccttctatc atcactgcca aagcgaaatg 1440 aagcccagct gggcagattc ggcccatggc gatgaagtcc cctatgtctt cggcatcccc 1500 atgatcggtc ccacagagct cttcagttgt aatttctcca agaacgacgt catgctcagt 1560 gccgtggtga tgacctactg gacgaacttc gccaaaactg gtgatccaaa ccaaccagtt 1620 
cctcaggata ccaagttcat tcatacaaaa cccaatcgct ttgaagaagt ggcctggtcc 1680 aagtataatc ccaaagacca gctctatctg catattggct tgaaacccag agtgagagat 1740 cactaccggg caacgaaagt ggctttctgg ttggaattgg ttcctcattt gcacaacttg 1800 aacgagatat tccagtatgt ttcaacaacc acaaaggttc ctccaccaga catgacatca 1860 tttccctatg gcacccggcg atctcccgcc aagatatggc caaccaccaa acgcccagca 1920 atcactcctg ccaacaatcc caaacactct aaggaccctc acaaaacagg gcccgaggac 1980 acaactgtcc tcattgaaac caaacgagat tattccaccg aattaagtgt caccattgcc 2040 gtcggggcgt cgctcctctt cctcaacatc ttagcctttg cggcgctgta ctacaaaaag 2100 gacaagaggc gccatgagac tcacaggcac cccagtcccc agagaaacac cacaaatgat 2160 atcactcaca tccagaacga agagatcatg tctctgcaga tgaagcagct ggaacacgat 2220 cacgagtgtg agtcgctgca ggcacacgac acgctgaggc tcacctgccc tccagactac 2280 accctcacgc tgcgccggtc gccggatgac atcccattta tgacgccaaa caccatcacc 2340 atgattccaa acacattgat ggggatgcag cctttacaca cttttaaaac cttcagtgga 2400 ggacaaaaca gtacaaattt accccacgga cattccacca ctagagtata g 2451 30 612 PRT Artificial Sequence consensus sequence 30 Met Val Leu Leu Leu Leu Phe Leu Leu Leu Leu Leu Leu Leu Ile Ala 1 5 10 15 Val Leu Ala Ala Ala Lys Ala Ser Pro Glu Asp Pro Leu Leu Val Ala 20 25 30 Thr Asn Asn Val Leu Cys Gly Lys Val Arg Gly Val Asn Glu Lys Thr 35 40 45 Asp Asn Gly Glu Gln Ser Val Tyr Ser Phe Leu Gly Ile Pro Tyr Ala 50 55 60 Glu Pro Pro Val Gly Asn Leu Arg Phe Lys Ala Pro Gln Pro Tyr Lys 65 70 75 80 Glu Pro Trp Ser Asp Val Leu Asp Ala Thr Lys Tyr Pro Pro Ser Cys 85 90 95 Leu Gln Asp Asp Asp Phe Gly Phe Ser Leu Ser Asp Leu Lys Val Ala 100 105 110 Leu Lys Met Leu Ser Leu Gly Trp Asn Lys Leu Val Gly Leu Lys Leu 115 120 125 Ser Glu Asp Cys Leu Tyr Leu Asn Val Tyr Thr Pro Lys Asn Thr Lys 130 135 140 Pro Asn Ser Lys Leu Pro Val Met Val Trp Ile His Gly Gly Gly Phe 145 150 155 160 Met Phe Gly Ser Gly His Ser Leu Pro Leu Ser Leu Tyr Asp Gly Glu 165 170 175 Ser Leu Ala Arg Glu Gly Asn Val Ile Val Val Ser Ile Asn Tyr Arg 180 185 190 Leu Gly Pro Leu Gly Phe Leu Ser Thr Gly Asp Asp Lys Leu Pro 
Gly 195 200 205 Ser Gly Asn Tyr Gly Leu Leu Asp Gln Arg Leu Ala Leu Lys Trp Val 210 215 220 Gln Asp Asn Ile Ala Ala Phe Gly Gly Asp Pro Asn Ser Val Thr Ile 225 230 235 240 Phe Gly Glu Ser Ala Gly Ala Ala Ser Val Ser Leu Leu Leu Leu Ser 245 250 255 Asn Gly Gly Asp Asn Pro Pro Ser Ser Lys Gly Leu Phe His Arg Ala 260 265 270 Ile Ser Gln Ser Gly Ser Ala Leu Ser Pro Trp Ala Ile Gln Ser Glu 275 280 285 Ser Asn Ala Arg Gly Arg Ala Lys Glu Leu Ala Arg Leu Leu Gly Cys 290 295 300 Asn Glu Thr Ser Ser Ser Glu Leu Leu Asp Cys Leu Arg Ser Lys Ser 305 310 315 320 Ala Glu Glu Leu Leu Glu Ala Thr Arg Ser Phe Leu Leu Phe Glu Tyr 325 330 335 Val Pro Phe Leu Pro Leu Phe Leu Ala Phe Gly Pro Val Val Asp Gly 340 345 350 Asp Asp Ala Pro Glu Ala Phe Ile Pro Glu Asp Pro Glu Glu Leu Ile 355 360 365 Lys Glu Gly Lys Phe Ala Asp Val Pro Tyr Leu Ile Gly Val Thr Lys 370 375 380 Asp Glu Gly Gly Tyr Phe Ala Ala Met Leu Leu Asn Ala Ser Ser Lys 385 390 395 400 Gly Glu Asp Glu Leu Lys Lys Glu Thr Asn Pro Asp Val Trp Leu Glu 405 410 415 Leu Leu Lys Tyr Leu Leu Phe Tyr Ala Ser Glu Ala Leu Asn Ile Lys 420 425 430 Asp Met Asp Asp Leu Ala Asp Lys Val Leu Glu Lys Tyr Pro Gly Asp 435 440 445 Val Asp Asp Phe Ser Val Glu Ser Arg Lys Pro Asn Leu Gln Asp Met 450 455 460 Leu Thr Asp Leu Leu Phe Lys Cys Pro Thr Arg Val Ala Ala Asp Leu 465 470 475 480 His Ala Lys His Gly Gly Ser Pro Val Tyr Ala Tyr Val Phe Asp His 485 490 495 Pro Ala Ser Phe Gly Ile Gly Gln Phe Leu Ala Lys Arg Val Asp Pro 500 505 510 Glu Phe Gly Gly Ala Val His Gly Asp Glu Ile Phe Phe Val Phe Gly 515 520 525 Asn Pro Leu Leu Lys Glu Gln Leu Tyr Lys Ala Thr Glu Glu Glu Glu 530 535 540 Lys Ser Ser Ser Lys Thr Met Met Asn Tyr Trp Ala Asn Phe Ala Lys 545 550 555 560 Thr Gly Asn Pro Asn Asn Gly Thr Ser Asn Gly Leu Val Val Trp Pro 565 570 575 Lys Tyr Thr Ser Glu Glu Gln Lys Tyr Ser Leu Leu Ile Leu Leu Thr 580 585 590 Thr Ile Thr Ala Gln Lys Leu Lys Ala Arg Asp Pro Arg Lys Val Leu 595 600 605 Cys Asn Phe Trp 610 31 848 PRT Rattus norvegicus 31 Met Trp Leu 
Gln Leu Gly Leu Pro Ser Leu Ser Leu Ser Pro Thr Pro 1 5 10 15 Thr Val Gly Arg Ser Leu Cys Leu Ile Leu Trp Phe Leu Ser Leu Val 20 25 30 Leu Arg Ala Ser Thr Gln Ala Pro Ala Pro Thr Val Asn Thr His Phe 35 40 45 Gly Lys Leu Arg Gly Ala Arg Val Pro Leu Pro Ser Glu Ile Leu Gly 50 55 60 Pro Val Asp Gln Tyr Leu Gly Val Pro Tyr Ala Ala Pro Pro Ile Gly 65 70 75 80 Glu Lys Arg Phe Leu Pro Pro Glu Pro Pro Pro Ser Trp Ser Gly Ile 85 90 95 Arg Asn Ala Thr His Phe Pro Pro Val Cys Pro Gln Asn Ile His Thr 100 105 110 Ala Val Pro Glu Val Met Leu Pro Val Trp Phe Thr Ala Asn Leu Asp 115 120 125 Ile Val Ala Thr Tyr Ile Gln Glu Pro Asn Glu Asp Cys Leu Tyr Leu 130 135 140 Asn Val Tyr Val Pro Thr Glu Asp Val Lys Arg Ile Ser Lys Glu Cys 145 150 155 160 Ala Arg Lys Pro Asn Lys Lys Ile Cys Arg Lys Gly Gly Ser Gly Ala 165 170 175 Lys Lys Gln Gly Glu Asp Leu Ala Asp Asn Asp Gly Asp Glu Asp Glu 180 185 190 Asp Ile Arg Asp Ser Gly Ala Lys Pro Val Met Val Tyr Ile His Gly 195 200 205 Gly Ser Tyr Met Glu Gly Thr Gly Asn Met Ile Asp Gly Ser Val Leu 210 215 220 Ala Ser Tyr Gly Asn Val Ile Val Ile Thr Leu Asn Tyr Arg Val Gly 225 230 235 240 Val Leu Gly Phe Leu Ser Thr Gly Asp Gln Ala Ala Lys Gly Asn Tyr 245 250 255 Gly Leu Leu Asp Gln Ile Gln Ala Leu Arg Trp Val Ser Glu Asn Ile 260 265 270 Ala Phe Phe Gly Gly Asp Pro Arg Arg Ile Thr Val Phe Gly Ser Gly 275 280 285 Ile Gly Ala Ser Cys Val Ser Leu Leu Thr Leu Ser His His Ser Glu 290 295 300 Gly Leu Phe Gln Arg Ala Ile Ile Gln Ser Gly Ser Ala Leu Ser Ser 305 310 315 320 Trp Ala Val Asn Tyr Gln Pro Val Lys Tyr Thr Ser Leu Leu Ala Asp 325 330 335 Lys Val Gly Cys Asn Val Leu Asp Thr Val Asp Met Val Asp Cys Leu 340 345 350 Arg Gln Lys Ser Ala Lys Glu Leu Val Glu Gln Asp Ile Gln Pro Ala 355 360 365 Arg Tyr His Val Ala Phe Gly Pro Val Ile Asp Gly Asp Val Ile Pro 370 375 380 Asp Asp Pro Glu Ile Leu Met Glu Gln Gly Glu Phe Leu Asn Tyr Asp 385 390 395 400 Ile Met Leu Gly Val Asn Gln Gly Glu Gly Leu Lys Phe Val Glu Gly 405 410 415 Val Val Asp Pro Glu Asp Gly Val 
Ser Gly Thr Asp Phe Asp Tyr Ser 420 425 430 Val Ser Asn Phe Val Asp Asn Leu Tyr Gly Tyr Pro Glu Gly Lys Asp 435 440 445 Thr Leu Arg Glu Thr Ile Lys Phe Met Tyr Thr Asp Trp Ala Asp Arg 450 455 460 Asp Asn Pro Glu Thr Arg Arg Lys Thr Leu Val Ala Leu Phe Thr Asp 465 470 475 480 His Gln Trp Val Glu Pro Ser Val Val Thr Ala Asp Leu His Ala Arg 485 490 495 Tyr Gly Ser Pro Thr Tyr Phe Tyr Ala Phe Tyr His His Cys Gln Ser 500 505 510 Leu Met Lys Pro Ala Trp Ser Asp Ala Ala His Gly Asp Glu Val Pro 515 520 525 Tyr Val Phe Gly Val Pro Met Val Gly Pro Thr Asp Leu Phe Pro Cys 530 535 540 Asn Phe Ser Lys Asn Asp Val Met Leu Ser Ala Val Val Met Thr Tyr 545 550 555 560 Trp Thr Asn Phe Ala Lys Thr Gly Asp Pro Asn Lys Pro Val Pro Gln 565 570 575 Asp Thr Lys Phe Ile His Thr Lys Ala Asn Arg Phe Glu Glu Val Ala 580 585 590 Trp Ser Lys Tyr Asn Pro Arg Asp Gln Leu Tyr Leu His Ile Gly Leu 595 600 605 Lys Pro Arg Val Arg Asp His Tyr Arg Ala Thr Lys Val Ala Phe Trp 610 615 620 Lys His Leu Val Pro His Leu Tyr Asn Leu His Asp Met Phe His Tyr 625 630 635 640 Thr Ser Thr Thr Thr Lys Val Pro Pro Pro Asp Thr Thr His Ser Ser 645 650 655 His Ile Thr Arg Arg Pro Asn Gly Lys Thr Trp Ser Thr Lys Arg Pro 660 665 670 Ala Ile Ser Pro Ala Tyr Ser Asn Glu Asn Ala Pro Gly Ser Trp Asn 675 680 685 Gly Asp Gln Asp Ala Gly Pro Leu Leu Val Glu Asn Pro Arg Asp Tyr 690 695 700 Ser Thr Glu Leu Ser Val Thr Ile Ala Val Gly Ala Ser Leu Leu Phe 705 710 715 720 Leu Asn Val Leu Ala Phe Ala Ala Leu Tyr Tyr Arg Lys Asp Lys Arg 725 730 735 Arg Gln Glu Pro Leu Arg Gln Pro Ser Pro Gln Arg Gly Thr Gly Ala 740 745 750 Pro Glu Leu Gly Thr Ala Pro Glu Glu Glu Leu Ala Ala Leu Gln Leu 755 760 765 Gly Pro Thr His His Glu Cys Glu Ala Gly Pro Pro His Asp Thr Leu 770 775 780 Arg Leu Thr Ala Leu Pro Asp Tyr Thr Leu Thr Leu Arg Arg Ser Pro 785 790 795 800 Asp Asp Ile Pro Leu Met Thr Pro Asn Thr Ile Thr Met Ile Pro Asn 805 810 815 Ser Leu Val Gly Leu Gln Thr Leu His Pro Tyr Asn Thr Phe Ala Ala 820 825 830 Gly Phe Asn Ser Thr Gly Leu Pro Asn 
Ser His Ser Thr Thr Arg Val 835 840 845 32 6 PRT Artificial Sequence signature domain sequence 32 Glu Asp Xaa Cys Leu Tyr 1 5 33 2305 DNA Homo sapiens CDS (33)...(1058) 33 ggctcgccag gacctggcaa ggcttgttta ct atg gcc gat gat ctg gag cag 53 Met Ala Asp Asp Leu Glu Gln 1 5 cag tct caa ggc tgg ctg agt agc tgg ctg ccc acg tgg cgc ccc act 101 Gln Ser Gln Gly Trp Leu Ser Ser Trp Leu Pro Thr Trp Arg Pro Thr 10 15 20 tcc atg tct cag ctg aag aat gtg gaa gcc agg atc ctc cag tgt ctc 149 Ser Met Ser Gln Leu Lys Asn Val Glu Ala Arg Ile Leu Gln Cys Leu 25 30 35 cag aat aag ttc ctg gcc aga tat gta tcc ctc cca aac cag aat aag 197 Gln Asn Lys Phe Leu Ala Arg Tyr Val Ser Leu Pro Asn Gln Asn Lys 40 45 50 55 atc tgg acg gtg act gtg agc ccc gag caa aac gac cgc acc ccc ttg 245 Ile Trp Thr Val Thr Val Ser Pro Glu Gln Asn Asp Arg Thr Pro Leu 60 65 70 gtg atg gtg cat ggt ttt ggg ggc ggc gtg ggt ctc tgg atc ctc aac 293 Val Met Val His Gly Phe Gly Gly Gly Val Gly Leu Trp Ile Leu Asn 75 80 85 atg gac tca ctg agt gcc cgc cgc aca ctg cac acc ttc gat ctg ctt 341 Met Asp Ser Leu Ser Ala Arg Arg Thr Leu His Thr Phe Asp Leu Leu 90 95 100 ggc ttc ggg cga agc tca agg cca gca ttc cca agg gac ccg gag ggg 389 Gly Phe Gly Arg Ser Ser Arg Pro Ala Phe Pro Arg Asp Pro Glu Gly 105 110 115 gct gag gat gag ttt gtg aca tcg ata gag aca tgg cgg gag acc atg 437 Ala Glu Asp Glu Phe Val Thr Ser Ile Glu Thr Trp Arg Glu Thr Met 120 125 130 135 ggg atc ccc agc atg atc ctc ctg ggg cac agt ttg gga gga ttc ctg 485 Gly Ile Pro Ser Met Ile Leu Leu Gly His Ser Leu Gly Gly Phe Leu 140 145 150 gcc act tct tac tca atc aag tac cct gat aga gtt aaa cac ctc atc 533 Ala Thr Ser Tyr Ser Ile Lys Tyr Pro Asp Arg Val Lys His Leu Ile 155 160 165 ctg gtg gac cca tgg ggc ttt ccc ctc cga cca act aac ccc agt gag 581 Leu Val Asp Pro Trp Gly Phe Pro Leu Arg Pro Thr Asn Pro Ser Glu 170 175 180 atc cgt gca ccc cca gcc tgg gtc aaa gcc gtg gca tct gtc cta gga 629 Ile Arg Ala Pro Pro Ala Trp Val Lys Ala Val Ala Ser Val Leu Gly 185 190 195 cgt tcc 
aat cca ttg gct gtt ctt cga gta gct ggg ccc tgg ggg cct 677 Arg Ser Asn Pro Leu Ala Val Leu Arg Val Ala Gly Pro Trp Gly Pro 200 205 210 215 ggt ctg gtg cag cga ttc cgg ccg gac ttc aaa cgc aag ttt gca gac 725 Gly Leu Val Gln Arg Phe Arg Pro Asp Phe Lys Arg Lys Phe Ala Asp 220 225 230 ttc ttt gaa gat gat acc ata tca gag tat att tac cac tgc aac gca 773 Phe Phe Glu Asp Asp Thr Ile Ser Glu Tyr Ile Tyr His Cys Asn Ala 235 240 245 cag aat ccc agt ggt gag aca gca ttc aaa gcc atg atg gag tcc ttt 821 Gln Asn Pro Ser Gly Glu Thr Ala Phe Lys Ala Met Met Glu Ser Phe 250 255 260 ggc tgg gcc cgg cgc cct atg ctg gag cga att cac ttg att cga aaa 869 Gly Trp Ala Arg Arg Pro Met Leu Glu Arg Ile His Leu Ile Arg Lys 265 270 275 gat gtg cct atc act atg atc tac ggg tcc gac acc tgg ata gat acc 917 Asp Val Pro Ile Thr Met Ile Tyr Gly Ser Asp Thr Trp Ile Asp Thr 280 285 290 295 agt acg gga aaa aag gtg aag atg cag cgg ccg gat tcc tat gtc cga 965 Ser Thr Gly Lys Lys Val Lys Met Gln Arg Pro Asp Ser Tyr Val Arg 300 305 310 gac atg gag att aag ggt gcc tcc cac cat gtc tat gct gac cag cca 1013 Asp Met Glu Ile Lys Gly Ala Ser His His Val Tyr Ala Asp Gln Pro 315 320 325 cac atc ttc aat gct gtg gtg gag gag atc tgc gac tca gtt gat 1058 His Ile Phe Asn Ala Val Val Glu Glu Ile Cys Asp Ser Val Asp 330 335 340 tgagctgctc tctgaagagg aagaggagaa agccagagag tcactcttac ctccctgtct 1118 gcttactcac ccactctgtc ctttcctcac caactaacat gtgccagcca ggcagagtct 1178 tgtgctgttc ccagaacagg acgacagtga aaagaacact cttgacccta cactgaaggc 1238 tgaaggcaga agccacaaga ggccttgagt gccaccccca gggaagaaca taaagggttg 1298 cacaatgcca cccatccact ccttgccaag tgttacccag atggtggagg atgtgaaggg 1358 attgcaccaa gccacattca ctctctctgt ggcctttctt cctctgggca aagaagggct 1418 tccagtggcc tttcctcact ctgtagtgtt tgtggggata ggttccatgc aagaacacct 1478 tcctcctcca tcccccactt caccccatcc cataccagtt ccatccaggg tctgcttaac 1538 tgccaagagc aggtcctgga gttcccttca cctgcagagt ccttttcatg acctaggagg 1598 tcttattcaa agccctcatt gacagaggag gaaacaggcc aaggcaggac atggctggac 1658 
catggtgata cagctctgtg tgattcaagt tctggcagag cttgtaaggc tagagcccag 1718 gtctgccgac accctgtgct tgttgcacac ttgatttgct aaggctggag acaggcacca 1778 ttgccatggg gctggtccta gtcactggcc gaggataagc ccgtccctgt cccacattct 1838 agccccacta tgcgggggtg ctgttgtcct gcctgtgtct catccccagc tgcctaagct 1898 agggacactc aagtgcttcc ttccttgccc catcttcctc ccaactggag gcctctgagc 1958 ctcccctgtg ccttgggccc tgaagcccca tatgtagtat agagcaaagg tggctcctgg 2018 tgaagagagg gtggaaaggc ccttcagccc cagggccatg tctgggttct ccatgcccat 2078 cagtctctgc agtttctcta cctgccccca gagctgaggc catctgcaag cccctgccca 2138 tggcccaatg gggagcctcc agccacaagt tccctgtcct tatcagccac tgggtggttc 2198 ccactgcatg accctctatc cctgccatct gtccccatgg tttccagctc aatccacccc 2258 tgacccatct gtcagctttt tcccagggag ccgtttcagg ggttctg 2305 34 342 PRT Homo sapiens 34 Met Ala Asp Asp Leu Glu Gln Gln Ser Gln Gly Trp Leu Ser Ser Trp 1 5 10 15 Leu Pro Thr Trp Arg Pro Thr Ser Met Ser Gln Leu Lys Asn Val Glu 20 25 30 Ala Arg Ile Leu Gln Cys Leu Gln Asn Lys Phe Leu Ala Arg Tyr Val 35 40 45 Ser Leu Pro Asn Gln Asn Lys Ile Trp Thr Val Thr Val Ser Pro Glu 50 55 60 Gln Asn Asp Arg Thr Pro Leu Val Met Val His Gly Phe Gly Gly Gly 65 70 75 80 Val Gly Leu Trp Ile Leu Asn Met Asp Ser Leu Ser Ala Arg Arg Thr 85 90 95 Leu His Thr Phe Asp Leu Leu Gly Phe Gly Arg Ser Ser Arg Pro Ala 100 105 110 Phe Pro Arg Asp Pro Glu Gly Ala Glu Asp Glu Phe Val Thr Ser Ile 115 120 125 Glu Thr Trp Arg Glu Thr Met Gly Ile Pro Ser Met Ile Leu Leu Gly 130 135 140 His Ser Leu Gly Gly Phe Leu Ala Thr Ser Tyr Ser Ile Lys Tyr Pro 145 150 155 160 Asp Arg Val Lys His Leu Ile Leu Val Asp Pro Trp Gly Phe Pro Leu 165 170 175 Arg Pro Thr Asn Pro Ser Glu Ile Arg Ala Pro Pro Ala Trp Val Lys 180 185 190 Ala Val Ala Ser Val Leu Gly Arg Ser Asn Pro Leu Ala Val Leu Arg 195 200 205 Val Ala Gly Pro Trp Gly Pro Gly Leu Val Gln Arg Phe Arg Pro Asp 210 215 220 Phe Lys Arg Lys Phe Ala Asp Phe Phe Glu Asp Asp Thr Ile Ser Glu 225 230 235 240 Tyr Ile Tyr His Cys Asn Ala Gln Asn Pro Ser Gly Glu Thr Ala Phe 245 250 
255 Lys Ala Met Met Glu Ser Phe Gly Trp Ala Arg Arg Pro Met Leu Glu 260 265 270 Arg Ile His Leu Ile Arg Lys Asp Val Pro Ile Thr Met Ile Tyr Gly 275 280 285 Ser Asp Thr Trp Ile Asp Thr Ser Thr Gly Lys Lys Val Lys Met Gln 290 295 300 Arg Pro Asp Ser Tyr Val Arg Asp Met Glu Ile Lys Gly Ala Ser His 305 310 315 320 His Val Tyr Ala Asp Gln Pro His Ile Phe Asn Ala Val Val Glu Glu 325 330 335 Ile Cys Asp Ser Val Asp 340 35 1029 DNA Homo sapiens 35 atggccgatg atctggagca gcagtctcaa ggctggctga gtagctggct gcccacgtgg 60 cgccccactt ccatgtctca gctgaagaat gtggaagcca ggatcctcca gtgtctccag 120 aataagttcc tggccagata tgtatccctc ccaaaccaga ataagatctg gacggtgact 180 gtgagccccg agcaaaacga ccgcaccccc ttggtgatgg tgcatggttt tgggggcggc 240 gtgggtctct ggatcctcaa catggactca ctgagtgccc gccgcacact gcacaccttc 300 gatctgcttg gcttcgggcg aagctcaagg ccagcattcc caagggaccc ggagggggct 360 gaggatgagt ttgtgacatc gatagagaca tggcgggaga ccatggggat ccccagcatg 420 atcctcctgg ggcacagttt gggaggattc ctggccactt cttactcaat caagtaccct 480 gatagagtta aacacctcat cctggtggac ccatggggct ttcccctccg accaactaac 540 cccagtgaga tccgtgcacc cccagcctgg gtcaaagccg tggcatctgt cctaggacgt 600 tccaatccat tggctgttct tcgagtagct gggccctggg ggcctggtct ggtgcagcga 660 ttccggccgg acttcaaacg caagtttgca gacttctttg aagatgatac catatcagag 720 tatatttacc actgcaacgc acagaatccc agtggtgaga cagcattcaa agccatgatg 780 gagtcctttg gctgggcccg gcgccctatg ctggagcgaa ttcacttgat tcgaaaagat 840 gtgcctatca ctatgatcta cgggtccgac acctggatag ataccagtac gggaaaaaag 900 gtgaagatgc agcggccgga ttcctatgtc cgagacatgg agattaaggg tgcctcccac 960 catgtctatg ctgaccagcc acacatcttc aatgctgtgg tggaggagat ctgcgactca 1020 gttgattga 1029 36 232 PRT Artificial Sequence consensus sequence 36 Phe Arg Val Ile Leu Leu Asp Leu Arg Gly Phe Gly Glu Ser Ser Pro 1 5 10 15 Ser Asp Leu Ala Glu Tyr Arg Phe Asp Asp Leu Ala Glu Asp Leu Glu 20 25 30 Ala Leu Leu Asp Ala Leu Gly Leu Glu Lys Pro Val Ile Leu Val Gly 35 40 45 His Ser Met Gly Gly Ala Ile Ala Leu Ala Tyr Ala Ala Lys Tyr Pro 50 55 60 Glu 
Leu Arg Val Lys Ala Leu Val Leu Val Ser Pro Pro Leu Pro Ala 65 70 75 80 Gly Leu Ser Ser Asp Leu Phe Pro Arg Gln Gly Asn Leu Glu Gly Leu 85 90 95 Leu Leu Ala Asn Phe Arg Asn Arg Leu Ser Arg Ser Val Glu Ala Leu 100 105 110 Leu Gly Arg Ala Leu Lys Gln Phe Phe Leu Leu Gly Arg Pro Leu Val 115 120 125 Ser Asp Phe Leu Lys Gln Ala Glu Asp Trp Leu Ser Ser Leu Ile Arg 130 135 140 Gln Gly Glu Asp Asp Gly Gly Asp Gly Leu Leu Gly Ala Ala Val Ala 145 150 155 160 Leu Gly Lys Leu Leu Gln Trp Asp Leu Ser Ala Leu Lys Asp Ile Lys 165 170 175 Val Pro Thr Leu Val Ile Trp Gly Thr Asp Asp Pro Leu Val Pro Leu 180 185 190 Asp Ala Ser Glu Lys Leu Ser Ala Leu Ile Pro Asn Ala Glu Val Val 195 200 205 Val Ile Asp Asp Ala Gly His Leu Ala Leu Leu Glu Lys Pro Glu Glu 210 215 220 Val Ala Glu Leu Ile Lys Phe Leu 225 230 37 62 PRT Artificial Sequence consensus sequence 37 Gln Arg Gln Gly Trp Leu Thr Gly Trp Leu Pro Thr Trp Cys Pro Thr 1 5 10 15 Ser Met Ser His Leu Lys Asn Ala Glu Glu Arg Met Leu Gln Cys Leu 20 25 30 Pro Cys Lys Tyr Lys Lys Arg Pro Val Arg Ile Pro Asn Gly Asn Lys 35 40 45 Ile Trp Thr Leu Lys Phe Ser His Asn Gln Asn Asn Arg Thr 50 55 60 38 93 PRT Artificial Sequence consensus sequence 38 Pro Pro Ala Trp Val Arg Ala Leu Gly Gly Val Leu Gly Pro Phe Asn 1 5 10 15 Pro Leu Ala Ala Leu Arg Leu Val Gly Pro Tyr Gly Pro Ser Leu Val 20 25 30 Lys Arg Leu Arg Pro Asp Leu Met Arg Lys Tyr Ser Glu Asp His Glu 35 40 45 Tyr Asp Thr Asn Leu Val Tyr Asp Tyr Ile Tyr Tyr Cys Asn Ser Gln 50 55 60 Asn Pro Thr Gly Glu Thr Ala Phe Lys Asn Met Thr Glu Asn Leu Gly 65 70 75 80 Trp Ala Lys Arg Pro Met Ile Lys Arg Phe Arg Leu Met 85 90 39 63 PRT Artificial Sequence consensus sequence 39 Asp Val Pro Val Thr Phe Ile Tyr Gly Ser Arg Ser Trp Ile Asp Trp 1 5 10 15 Asn Thr Gly Arg Lys Ile Lys Gly Gln Arg Pro His Ser Tyr Val Glu 20 25 30 Thr His Ile Ile Glu Gly Ala Gly His His Val Tyr Ala Asp Gln Pro 35 40 45 Asp Glu Phe Asn Gln Leu Val Asn Glu Thr Cys Asp Ser Val Asp 50 55 60 40 6 PRT Artificial Sequence exemplary motif 
40 Gly Xaa Ser Xaa Gly Gly 1 5 41 1579 DNA Homo sapiens CDS (66)...(1304) 41 cggagcttcc gaaccaggcg ggattccacc gggtatttgc ctgcggaggc gggacttcgg 60 gcttg atg ggc gtt ggg ggt ggc ctt cct gcg ggc agg ctc tct gtg tcg 110 Met Gly Val Gly Gly Gly Leu Pro Ala Gly Arg Leu Ser Val Ser 1 5 10 15 caa cac tgg cgg ggc ggg cca aat cgg cca gag ctc tgc ccc cag agg 158 Gln His Trp Arg Gly Gly Pro Asn Arg Pro Glu Leu Cys Pro Gln Arg 20 25 30 acg cgg cta agc ccg ggg gcg tgt cct ggg ctg gcc cca ccc gcg ccc 206 Thr Arg Leu Ser Pro Gly Ala Cys Pro Gly Leu Ala Pro Pro Ala Pro 35 40 45 cgc ccc gcc ccg ccc ggt cgc gga gct gcg gcc agc ttt ggg agg gcc 254 Arg Pro Ala Pro Pro Gly Arg Gly Ala Ala Ala Ser Phe Gly Arg Ala 50 55 60 ggc ccc ggg atg cta cac aca acc cag ctg tac cag cat gtg cca gag 302 Gly Pro Gly Met Leu His Thr Thr Gln Leu Tyr Gln His Val Pro Glu 65 70 75 aca cgc tgg cca atc gtg tac tcg ccg cgc tac aac atc acc ttc atg 350 Thr Arg Trp Pro Ile Val Tyr Ser Pro Arg Tyr Asn Ile Thr Phe Met 80 85 90 95 ggc ctg gag aag ctg cat ccc ttt gat gcc gga aaa tgg ggc aaa gtg 398 Gly Leu Glu Lys Leu His Pro Phe Asp Ala Gly Lys Trp Gly Lys Val 100 105 110 atc aat ttc cta aaa gaa gag aag ctt ctg tct gac agc atg ctg gtg 446 Ile Asn Phe Leu Lys Glu Glu Lys Leu Leu Ser Asp Ser Met Leu Val 115 120 125 gag gcg cgg gag gcc tcg gag gag gac ctg ctg gtg gtg cac acg agg 494 Glu Ala Arg Glu Ala Ser Glu Glu Asp Leu Leu Val Val His Thr Arg 130 135 140 cgc tat ctt aat gag ctc aag tgg tcc ttt gct gtt gct acc atc aca 542 Arg Tyr Leu Asn Glu Leu Lys Trp Ser Phe Ala Val Ala Thr Ile Thr 145 150 155 gaa atc ccc ccc gtt atc ttc ctc ccc aac ttc ctt gtg cag agg aag 590 Glu Ile Pro Pro Val Ile Phe Leu Pro Asn Phe Leu Val Gln Arg Lys 160 165 170 175 gtg ctg agg ccc ctt cgg acc cag aca gga gga acc ata atg gcg ggg 638 Val Leu Arg Pro Leu Arg Thr Gln Thr Gly Gly Thr Ile Met Ala Gly 180 185 190 aag ctg gct gtg gag cga ggc tgg gcc atc aac gtg ggg ggt ggc ttc 686 Lys Leu Ala Val Glu Arg Gly Trp Ala Ile Asn Val Gly Gly Gly Phe 195 
200 205 cac cac tgc tcc agc gac cgt ggc ggg ggc ttc tgt gcc tat gcg gac 734 His His Cys Ser Ser Asp Arg Gly Gly Gly Phe Cys Ala Tyr Ala Asp 210 215 220 atc acg ctc gcc atc aag ttt ctg ttt gag cgt gtg gag ggc atc tcc 782 Ile Thr Leu Ala Ile Lys Phe Leu Phe Glu Arg Val Glu Gly Ile Ser 225 230 235 agg gct acc atc att gat ctt gat gcc cat cag ggc aat ggg cat gag 830 Arg Ala Thr Ile Ile Asp Leu Asp Ala His Gln Gly Asn Gly His Glu 240 245 250 255 cga gac ttc atg gac gac aag cgt gtg tac atc atg gat gtc tac aac 878 Arg Asp Phe Met Asp Asp Lys Arg Val Tyr Ile Met Asp Val Tyr Asn 260 265 270 cgc cac atc tac cca ggg gac cgc ttt gcc aag cag gcc atc agg cgg 926 Arg His Ile Tyr Pro Gly Asp Arg Phe Ala Lys Gln Ala Ile Arg Arg 275 280 285 aag gtg gag ctg gag tgg ggc aca gag gat gat gag tac ctg gat aag 974 Lys Val Glu Leu Glu Trp Gly Thr Glu Asp Asp Glu Tyr Leu Asp Lys 290 295 300 gtg gag agg aac atc aag aaa tcc ctc cag gag cac ctg ccc gac gtg 1022 Val Glu Arg Asn Ile Lys Lys Ser Leu Gln Glu His Leu Pro Asp Val 305 310 315 gtg gta tac aat gca ggc acc gac atc ctc gag ggg gac cgc ctt ggg 1070 Val Val Tyr Asn Ala Gly Thr Asp Ile Leu Glu Gly Asp Arg Leu Gly 320 325 330 335 ggg ctg tcc atc agc cca gcg ggc atc gtg aag cgg gat gag ctg gtg 1118 Gly Leu Ser Ile Ser Pro Ala Gly Ile Val Lys Arg Asp Glu Leu Val 340 345 350 ttc cgg atg gtc cgt ggc cgc cgg gtg ccc atc ctt atg gtg acc tca 1166 Phe Arg Met Val Arg Gly Arg Arg Val Pro Ile Leu Met Val Thr Ser 355 360 365 ggc ggg tac cag aag cgc aca gcc cgc atc att gct gac tcc ata ctt 1214 Gly Gly Tyr Gln Lys Arg Thr Ala Arg Ile Ile Ala Asp Ser Ile Leu 370 375 380 aat ctg ttt ggc ctg ggg ctc att ggg cct gag tca ccc agc gtc tcc 1262 Asn Leu Phe Gly Leu Gly Leu Ile Gly Pro Glu Ser Pro Ser Val Ser 385 390 395 gca cag aac tca gac aca ccg ctg ctt ccc cct gca gtg ccc 1304 Ala Gln Asn Ser Asp Thr Pro Leu Leu Pro Pro Ala Val Pro 400 405 410 tgacccttgc tgccctgcct gtcacgtggc cctgcctatc cgccccttag tgctttttgt 1364 tttctaacct catggggtgg tggaggcagc cttcagtgag 
catggagggg cagggccatc 1424 cctggctggg gcctggagct ggcccttcct ctacttttcc ctgctggaag ccagaagggc 1484 ttgaggcctc tatgggtggg ggcagaaggc agagcctgtg tcccaggggg acccacacga 1544 agtcaccagc ccataggtcc agggaggcag gcagg 1579 42 413 PRT Homo sapiens 42 Met Gly Val Gly Gly Gly Leu Pro Ala Gly Arg Leu Ser Val Ser Gln 1 5 10 15 His Trp Arg Gly Gly Pro Asn Arg Pro Glu Leu Cys Pro Gln Arg Thr 20 25 30 Arg Leu Ser Pro Gly Ala Cys Pro Gly Leu Ala Pro Pro Ala Pro Arg 35 40 45 Pro Ala Pro Pro Gly Arg Gly Ala Ala Ala Ser Phe Gly Arg Ala Gly 50 55 60 Pro Gly Met Leu His Thr Thr Gln Leu Tyr Gln His Val Pro Glu Thr 65 70 75 80 Arg Trp Pro Ile Val Tyr Ser Pro Arg Tyr Asn Ile Thr Phe Met Gly 85 90 95 Leu Glu Lys Leu His Pro Phe Asp Ala Gly Lys Trp Gly Lys Val Ile 100 105 110 Asn Phe Leu Lys Glu Glu Lys Leu Leu Ser Asp Ser Met Leu Val Glu 115 120 125 Ala Arg Glu Ala Ser Glu Glu Asp Leu Leu Val Val His Thr Arg Arg 130 135 140 Tyr Leu Asn Glu Leu Lys Trp Ser Phe Ala Val Ala Thr Ile Thr Glu 145 150 155 160 Ile Pro Pro Val Ile Phe Leu Pro Asn Phe Leu Val Gln Arg Lys Val 165 170 175 Leu Arg Pro Leu Arg Thr Gln Thr Gly Gly Thr Ile Met Ala Gly Lys 180 185 190 Leu Ala Val Glu Arg Gly Trp Ala Ile Asn Val Gly Gly Gly Phe His 195 200 205 His Cys Ser Ser Asp Arg Gly Gly Gly Phe Cys Ala Tyr Ala Asp Ile 210 215 220 Thr Leu Ala Ile Lys Phe Leu Phe Glu Arg Val Glu Gly Ile Ser Arg 225 230 235 240 Ala Thr Ile Ile Asp Leu Asp Ala His Gln Gly Asn Gly His Glu Arg 245 250 255 Asp Phe Met Asp Asp Lys Arg Val Tyr Ile Met Asp Val Tyr Asn Arg 260 265 270 His Ile Tyr Pro Gly Asp Arg Phe Ala Lys Gln Ala Ile Arg Arg Lys 275 280 285 Val Glu Leu Glu Trp Gly Thr Glu Asp Asp Glu Tyr Leu Asp Lys Val 290 295 300 Glu Arg Asn Ile Lys Lys Ser Leu Gln Glu His Leu Pro Asp Val Val 305 310 315 320 Val Tyr Asn Ala Gly Thr Asp Ile Leu Glu Gly Asp Arg Leu Gly Gly 325 330 335 Leu Ser Ile Ser Pro Ala Gly Ile Val Lys Arg Asp Glu Leu Val Phe 340 345 350 Arg Met Val Arg Gly Arg Arg Val Pro Ile Leu Met Val Thr Ser Gly 355 360 365 Gly Tyr Gln 
Lys Arg Thr Ala Arg Ile Ile Ala Asp Ser Ile Leu Asn 370 375 380 Leu Phe Gly Leu Gly Leu Ile Gly Pro Glu Ser Pro Ser Val Ser Ala 385 390 395 400 Gln Asn Ser Asp Thr Pro Leu Leu Pro Pro Ala Val Pro 405 410 43 1242 DNA Homo sapiens 43 atgggcgttg ggggtggcct tcctgcgggc aggctctctg tgtcgcaaca ctggcggggc 60 gggccaaatc ggccagagct ctgcccccag aggacgcggc taagcccggg ggcgtgtcct 120 gggctggccc cacccgcgcc ccgccccgcc ccgcccggtc gcggagctgc ggccagcttt 180 gggagggccg gccccgggat gctacacaca acccagctgt accagcatgt gccagagaca 240 cgctggccaa tcgtgtactc gccgcgctac aacatcacct tcatgggcct ggagaagctg 300 catccctttg atgccggaaa atggggcaaa gtgatcaatt tcctaaaaga agagaagctt 360 ctgtctgaca gcatgctggt ggaggcgcgg gaggcctcgg aggaggacct gctggtggtg 420 cacacgaggc gctatcttaa tgagctcaag tggtcctttg ctgttgctac catcacagaa 480 atcccccccg ttatcttcct ccccaacttc cttgtgcaga ggaaggtgct gaggcccctt 540 cggacccaga caggaggaac cataatggcg gggaagctgg ctgtggagcg aggctgggcc 600 atcaacgtgg ggggtggctt ccaccactgc tccagcgacc gtggcggggg cttctgtgcc 660 tatgcggaca tcacgctcgc catcaagttt ctgtttgagc gtgtggaggg catctccagg 720 gctaccatca ttgatcttga tgcccatcag ggcaatgggc atgagcgaga cttcatggac 780 gacaagcgtg tgtacatcat ggatgtctac aaccgccaca tctacccagg ggaccgcttt 840 gccaagcagg ccatcaggcg gaaggtggag ctggagtggg gcacagagga tgatgagtac 900 ctggataagg tggagaggaa catcaagaaa tccctccagg agcacctgcc cgacgtggtg 960 gtatacaatg caggcaccga catcctcgag ggggaccgcc ttggggggct gtccatcagc 1020 ccagcgggca tcgtgaagcg ggatgagctg gtgttccgga tggtccgtgg ccgccgggtg 1080 cccatcctta tggtgacctc aggcgggtac cagaagcgca cagcccgcat cattgctgac 1140 tccatactta atctgtttgg cctggggctc attgggcctg agtcacccag cgtctccgca 1200 cagaactcag acacaccgct gcttccccct gcagtgccct ga 1242 44 342 PRT Artificial Sequence consensus sequence 44 Gly Tyr Val Tyr Asp Pro Glu Val Leu Asn His Glu Cys Lys Ile Ser 1 5 10 15 Tyr Gly Ala Thr His Pro Glu Asn Pro Glu Arg Leu Arg Leu Ile His 20 25 30 Glu Leu Leu Leu Glu Tyr Gly Leu Leu Lys Lys Met Glu Ile Val Thr 35 40 45 Asn Pro Arg Lys Ala Thr Asp Glu Glu Leu 
Leu Leu Val His Ser Glu 50 55 60 Asp Tyr Val Glu Phe Leu Glu Ser Leu Ser Lys Thr Asn Leu Glu Glu 65 70 75 80 Leu Glu Lys Gly Thr Asp Lys Ile Leu Leu Glu Ile Glu Leu Lys Tyr 85 90 95 Phe Asn Lys Gly Asp Asp Thr Pro Val Phe Ala Gly Leu Tyr Glu Ala 100 105 110 Ala Arg Leu Ala Val Gly Gly Ser Leu Glu Leu Ala Asp Arg Leu Leu 115 120 125 Glu Gly Glu Leu Asp Asn Ala Phe Asn Trp Ala Gly Gly Pro Gly His 130 135 140 His Ala Lys Lys Gly Glu Ala Ser Gly Phe Cys Tyr Phe Asn Asn Val 145 150 155 160 Ala Ile Ala Ile Lys Tyr Leu Leu Lys Lys Tyr Pro Leu Tyr Val Lys 165 170 175 Arg Val Leu Ile Ile Asp Phe Asp Val His His Gly Asp Gly Thr Gln 180 185 190 Glu Ile Phe Tyr Asp Asp Asp Arg Val Leu Thr Val Ser Phe His Lys 195 200 205 Tyr Gly Lys Gly Glu Phe Phe Pro Gly Thr Gly Asp Ile Thr Glu Ile 210 215 220 Gly Lys Gly Lys Gly Lys Gly Tyr Thr Leu Asn Ile Pro Leu Asn Glu 225 230 235 240 Asp Gly Thr Asp Asp Glu Ser Tyr Leu Ser Ala Phe Lys His Val Ile 245 250 255 Glu Pro Val Leu Glu Gln Phe Lys Pro Asp Ala Ile Val Ile Ser Ala 260 265 270 Gly Phe Asp Ala Leu Tyr Gly Asp Pro Thr Gln Leu Gly Ser Phe Asn 275 280 285 Leu Thr Ile Glu Gly Tyr Gly Glu Met Val Arg Phe Leu Lys Ser Leu 290 295 300 Ala Gln Lys His Cys Asp Gly Pro Leu Leu Val Val Leu Glu Gly Gly 305 310 315 320 Tyr Thr Leu Arg Ala Ile Ala Asn Val Ala Arg Cys Trp Ile Ala Leu 325 330 335 Thr Gly Gly Leu Leu Gly 340 45 45 PRT Artificial Sequence consensus sequence 45 Thr Gln Leu Tyr Phe His Val Pro Glu Thr Pro Trp Pro Ile Ile Tyr 1 5 10 15 Ser Pro Arg Tyr Asn Ile Thr Phe Met Gly Ile Glu Lys Leu His Pro 20 25 30 Phe Asp Ala Gly Lys Trp Gly Arg Val Cys Asn Phe Leu 35 40 45 46 150 PRT Artificial Sequence consensus sequence 46 Leu Leu Asp Asp Ser Glu Ile Tyr Arg Pro Arg Lys Ala Thr Glu Glu 1 5 10 15 Glu Leu Thr Arg Phe His Ser Glu Glu Tyr Ile Asp Phe Leu Arg Ser 20 25 30 Val Thr Pro Asp Asn Met Gln Glu Glu Tyr Ser Lys Gln Met Glu Arg 35 40 45 Phe Asn Gly Leu Val Gly Glu Asp Asp Asp Cys Pro Val Phe Asp Gly 50 55 60 Leu Tyr Glu Phe Cys Arg Leu Ala 
Ala Gly Gly Ser Ile Glu Ala Ala 65 70 75 80 Glu Lys Val Met Glu Gly Glu Ala Asp Asn Gly Phe Ala Asn Trp Arg 85 90 95 Pro Pro Gly His His Ala Lys Lys Ser Glu Ala Ser Gly Phe Cys Tyr 100 105 110 Phe Asn Asp Val Ala Ile Ala Val Lys His Leu Leu Lys Arg Arg His 115 120 125 Gly Val Lys Arg Val Leu Ile Ile Asp Trp Asp Val His His Gly Asn 130 135 140 Gly Thr Gln Glu Ile Phe 145 150 47 122 PRT Artificial Sequence consensus sequence 47 Gly Asn Gly Thr Ala Arg Leu Phe Thr Asp Asp Pro Ala Val Tyr Thr 1 5 10 15 Ile Phe Thr Tyr Asn Met His Cys Tyr Pro Asn Tyr Pro Phe Arg Lys 20 25 30 Gln Ala Ser Arg Met Asp Val Gly Leu Glu Asn Gly Thr Glu Asp Asp 35 40 45 Glu Tyr Leu Gln Val Leu Glu Arg His Ile Glu Gln Ser Leu Asn Glu 50 55 60 His Arg Pro Asp Leu Val Ile Tyr Asn Ala Gly Thr Asp Val Leu Glu 65 70 75 80 Gly Asp Arg Leu Gly Asn Leu Ala Ile Ser Pro Ala Gly Ile Val Lys 85 90 95 Arg Asp Arg Leu Val Phe Arg Met Ala Arg Ala Ala Gly Val Pro Ile 100 105 110 Val Cys Val Ile Gly Gly Gly Tyr Gln Lys 115 120 48 1391 DNA Homo sapiens CDS (9)...(1271) 48 gctccaag atg tca gca acg ctg atc ctg gag ccc cca ggc cgc tgc tgc 50 Met Ser Ala Thr Leu Ile Leu Glu Pro Pro Gly Arg Cys Cys 1 5 10 tgg aac gag ccg gtg cgc att gcc gtg cgc ggc ctg gcc ccg gag cag 98 Trp Asn Glu Pro Val Arg Ile Ala Val Arg Gly Leu Ala Pro Glu Gln 15 20 25 30 cgg gtt acg ctg cgc gcg tcc ctg cgc gac gag aag ggc gcg ctc ttc 146 Arg Val Thr Leu Arg Ala Ser Leu Arg Asp Glu Lys Gly Ala Leu Phe 35 40 45 cgg gcc cac gcg cgc tac tgc gcc gac gcc tgc ggc gag ctg gac ctg 194 Arg Ala His Ala Arg Tyr Cys Ala Asp Ala Cys Gly Glu Leu Asp Leu 50 55 60 gag cgc gca ccc gcg ctg ggc ggc agc ttc gcg gga ctc gag ccc atg 242 Glu Arg Ala Pro Ala Leu Gly Gly Ser Phe Ala Gly Leu Glu Pro Met 65 70 75 ggg ctg ctc tgg gcc ctg gaa ccc gag aag cct ttt tgg cgc ttc ctg 290 Gly Leu Leu Trp Ala Leu Glu Pro Glu Lys Pro Phe Trp Arg Phe Leu 80 85 90 aag cgg gac gta cag att cct ttt gtc gtg gag ttg gag gtg ctg gac 338 Lys Arg Asp Val Gln Ile Pro Phe Val Val Glu Leu Glu 
Val Leu Asp 95 100 105 110 ggc cac gac ccc gag cct gga cgg ctg ctg tgc cag gcg cag cac gag 386 Gly His Asp Pro Glu Pro Gly Arg Leu Leu Cys Gln Ala Gln His Glu 115 120 125 cgc cac ttc ctc ccg cca ggg gtg cgg cgc cag tcg gtg cga gcg ggc 434 Arg His Phe Leu Pro Pro Gly Val Arg Arg Gln Ser Val Arg Ala Gly 130 135 140 cgg gtg cgc gcc acg ctc ttc ctg ccg cca gga cct gga ccc ttc cca 482 Arg Val Arg Ala Thr Leu Phe Leu Pro Pro Gly Pro Gly Pro Phe Pro 145 150 155 ggg atc att gac atc ttt ggt att gga ggg ggc ctc ttg gaa tat cga 530 Gly Ile Ile Asp Ile Phe Gly Ile Gly Gly Gly Leu Leu Glu Tyr Arg 160 165 170 gcc agc ctc ctt gct ggc cat ggc ttt gcc acg ttg gct cta gct tat 578 Ala Ser Leu Leu Ala Gly His Gly Phe Ala Thr Leu Ala Leu Ala Tyr 175 180 185 190 tat aac ttt gaa gat ctc ccc aat aac atg gac aac ata tcc ctg gag 626 Tyr Asn Phe Glu Asp Leu Pro Asn Asn Met Asp Asn Ile Ser Leu Glu 195 200 205 tac ttc gaa gaa gcc gta tgc tac atg ctt caa cat ccc cag gta aaa 674 Tyr Phe Glu Glu Ala Val Cys Tyr Met Leu Gln His Pro Gln Val Lys 210 215 220 ggc cca ggc att ggg ctt ttg ggc att tct cta gga gct gat att tgt 722 Gly Pro Gly Ile Gly Leu Leu Gly Ile Ser Leu Gly Ala Asp Ile Cys 225 230 235 ctc tca atg gcc tca ttc ttg aag aat gtc tca gcc aca gtt tcc atc 770 Leu Ser Met Ala Ser Phe Leu Lys Asn Val Ser Ala Thr Val Ser Ile 240 245 250 aat gga tct ggg atc agt ggg aac aca gcc atc aac tat aag cac agt 818 Asn Gly Ser Gly Ile Ser Gly Asn Thr Ala Ile Asn Tyr Lys His Ser 255 260 265 270 agc att cca cca ttg ggc tat gac ctg agg aga atc aag gta gct ttc 866 Ser Ile Pro Pro Leu Gly Tyr Asp Leu Arg Arg Ile Lys Val Ala Phe 275 280 285 tca ggc ctc gtg gac atc gtg gat ata agg aat gct ctc gta gga ggg 914 Ser Gly Leu Val Asp Ile Val Asp Ile Arg Asn Ala Leu Val Gly Gly 290 295 300 tac aag aac ccc agc atg att cca ata gag aag gcc cag ggg ccc atc 962 Tyr Lys Asn Pro Ser Met Ile Pro Ile Glu Lys Ala Gln Gly Pro Ile 305 310 315 ctg ctc att gtt ggt cag gat gac cat aac tgg aga agt gag ttg tat 1010 Leu Leu Ile Val Gly Gln 
Asp Asp His Asn Trp Arg Ser Glu Leu Tyr 320 325 330 gcc caa aca gtc tct gaa cgg tta cag gcc cat gga aag gaa aaa ccc 1058 Ala Gln Thr Val Ser Glu Arg Leu Gln Ala His Gly Lys Glu Lys Pro 335 340 345 350 cag atc atc tgt tac cct ggg act ggg cat tac atc gag cct cct tac 1106 Gln Ile Ile Cys Tyr Pro Gly Thr Gly His Tyr Ile Glu Pro Pro Tyr 355 360 365 ttc ccc ctg tgc cca gct tcc ctt cac aga tta ctg aac aaa cat gtt 1154 Phe Pro Leu Cys Pro Ala Ser Leu His Arg Leu Leu Asn Lys His Val 370 375 380 ata tgg ggt ggg gag ccc agg gct cat tct aag gcc cag gaa gat gcc 1202 Ile Trp Gly Gly Glu Pro Arg Ala His Ser Lys Ala Gln Glu Asp Ala 385 390 395 tgg aag caa att cta gcc ttc ttc tgc aaa cac ctg gga ggt acc cag 1250 Trp Lys Gln Ile Leu Ala Phe Phe Cys Lys His Leu Gly Gly Thr Gln 400 405 410 aaa aca gct gtc cct aaa ttg taatgcattt gtctgttgtt gacatgagag 1301 Lys Thr Ala Val Pro Lys Leu 415 420 attcaagatc agattctagt gttcagtaac cctatgtgaa tcagatgtct cctggataac 1361 attaaagcca tgtctttgtc attaaaaaaa 1391 49 421 PRT Homo sapiens 49 Met Ser Ala Thr Leu Ile Leu Glu Pro Pro Gly Arg Cys Cys Trp Asn 1 5 10 15 Glu Pro Val Arg Ile Ala Val Arg Gly Leu Ala Pro Glu Gln Arg Val 20 25 30 Thr Leu Arg Ala Ser Leu Arg Asp Glu Lys Gly Ala Leu Phe Arg Ala 35 40 45 His Ala Arg Tyr Cys Ala Asp Ala Cys Gly Glu Leu Asp Leu Glu Arg 50 55 60 Ala Pro Ala Leu Gly Gly Ser Phe Ala Gly Leu Glu Pro Met Gly Leu 65 70 75 80 Leu Trp Ala Leu Glu Pro Glu Lys Pro Phe Trp Arg Phe Leu Lys Arg 85 90 95 Asp Val Gln Ile Pro Phe Val Val Glu Leu Glu Val Leu Asp Gly His 100 105 110 Asp Pro Glu Pro Gly Arg Leu Leu Cys Gln Ala Gln His Glu Arg His 115 120 125 Phe Leu Pro Pro Gly Val Arg Arg Gln Ser Val Arg Ala Gly Arg Val 130 135 140 Arg Ala Thr Leu Phe Leu Pro Pro Gly Pro Gly Pro Phe Pro Gly Ile 145 150 155 160 Ile Asp Ile Phe Gly Ile Gly Gly Gly Leu Leu Glu Tyr Arg Ala Ser 165 170 175 Leu Leu Ala Gly His Gly Phe Ala Thr Leu Ala Leu Ala Tyr Tyr Asn 180 185 190 Phe Glu Asp Leu Pro Asn Asn Met Asp Asn Ile Ser Leu Glu Tyr Phe 195 200 205 
Glu Glu Ala Val Cys Tyr Met Leu Gln His Pro Gln Val Lys Gly Pro 210 215 220 Gly Ile Gly Leu Leu Gly Ile Ser Leu Gly Ala Asp Ile Cys Leu Ser 225 230 235 240 Met Ala Ser Phe Leu Lys Asn Val Ser Ala Thr Val Ser Ile Asn Gly 245 250 255 Ser Gly Ile Ser Gly Asn Thr Ala Ile Asn Tyr Lys His Ser Ser Ile 260 265 270 Pro Pro Leu Gly Tyr Asp Leu Arg Arg Ile Lys Val Ala Phe Ser Gly 275 280 285 Leu Val Asp Ile Val Asp Ile Arg Asn Ala Leu Val Gly Gly Tyr Lys 290 295 300 Asn Pro Ser Met Ile Pro Ile Glu Lys Ala Gln Gly Pro Ile Leu Leu 305 310 315 320 Ile Val Gly Gln Asp Asp His Asn Trp Arg Ser Glu Leu Tyr Ala Gln 325 330 335 Thr Val Ser Glu Arg Leu Gln Ala His Gly Lys Glu Lys Pro Gln Ile 340 345 350 Ile Cys Tyr Pro Gly Thr Gly His Tyr Ile Glu Pro Pro Tyr Phe Pro 355 360 365 Leu Cys Pro Ala Ser Leu His Arg Leu Leu Asn Lys His Val Ile Trp 370 375 380 Gly Gly Glu Pro Arg Ala His Ser Lys Ala Gln Glu Asp Ala Trp Lys 385 390 395 400 Gln Ile Leu Ala Phe Phe Cys Lys His Leu Gly Gly Thr Gln Lys Thr 405 410 415 Ala Val Pro Lys Leu 420 50 1266 DNA Homo sapiens 50 atgtcagcaa cgctgatcct ggagccccca ggccgctgct gctggaacga gccggtgcgc 60 attgccgtgc gcggcctggc cccggagcag cgggttacgc tgcgcgcgtc cctgcgcgac 120 gagaagggcg cgctcttccg ggcccacgcg cgctactgcg ccgacgcctg cggcgagctg 180 gacctggagc gcgcacccgc gctgggcggc agcttcgcgg gactcgagcc catggggctg 240 ctctgggccc tggaacccga gaagcctttt tggcgcttcc tgaagcggga cgtacagatt 300 ccttttgtcg tggagttgga ggtgctggac ggccacgacc ccgagcctgg acggctgctg 360 tgccaggcgc agcacgagcg ccacttcctc ccgccagggg tgcggcgcca gtcggtgcga 420 gcgggccggg tgcgcgccac gctcttcctg ccgccaggac ctggaccctt cccagggatc 480 attgacatct ttggtattgg agggggcctc ttggaatatc gagccagcct ccttgctggc 540 catggctttg ccacgttggc tctagcttat tataactttg aagatctccc caataacatg 600 gacaacatat ccctggagta cttcgaagaa gccgtatgct acatgcttca acatccccag 660 gtaaaaggcc caggcattgg gcttttgggc atttctctag gagctgatat ttgtctctca 720 atggcctcat tcttgaagaa tgtctcagcc acagtttcca tcaatggatc tgggatcagt 780 gggaacacag ccatcaacta taagcacagt 
agcattccac cattgggcta tgacctgagg 840 agaatcaagg tagctttctc aggcctcgtg gacatcgtgg atataaggaa tgctctcgta 900 ggagggtaca agaaccccag catgattcca atagagaagg cccaggggcc catcctgctc 960 attgttggtc aggatgacca taactggaga agtgagttgt atgcccaaac agtctctgaa 1020 cggttacagg cccatggaaa ggaaaaaccc cagatcatct gttaccctgg gactgggcat 1080 tacatcgagc ctccttactt ccccctgtgc ccagcttccc ttcacagatt actgaacaaa 1140 catgttatat ggggtgggga gcccagggct cattctaagg cccaggaaga tgcctggaag 1200 caaattctag ccttcttctg caaacacctg ggaggtaccc agaaaacagc tgtccctaaa 1260 ttgtaa 1266 51 423 PRT Artificial Sequence consensus sequence 51 Met Ala Ala Thr Leu Leu Thr Ala Pro Pro Val Asp Ser Leu Gln Asp 1 5 10 15 Glu Pro Val His Ile Ala Val Thr Gly Leu Ala Pro Glu Gln Pro Tyr 20 25 30 Thr Phe Arg Ala Ser Leu Arg Asp Glu Lys Gly Thr Leu Phe Arg Ser 35 40 45 His Ala Arg Tyr Arg Ala Asp Ser Val Gly Glu Ile Asp Leu Glu Arg 50 55 60 Ala Pro Pro Leu Gly Gly Ser Tyr Ser Gly Val Asp Pro Met Gly Leu 65 70 75 80 Phe Trp Ser Met Glu Pro Thr Glu Lys Leu Leu Trp Arg Leu Tyr Lys 85 90 95 Arg Asp Val Pro Pro Thr Pro Phe Tyr Tyr Glu Leu Glu Leu Leu Asp 100 105 110 Gly His Glu Pro Ile Val Asp Arg Val Leu Ala Gln Thr Val His Ser 115 120 125 Leu Thr Leu Glu Arg His Trp Met Ala Pro Gly Val Arg Arg Val Pro 130 135 140 Val Arg Glu Gly Arg Ile Arg Gly Thr Leu Phe Leu Pro Pro Gly Glu 145 150 155 160 Gly Pro Phe Pro Gly Val Ile Asp Ile Phe Gly Thr Gly Gly Gly Leu 165 170 175 Leu Glu Tyr Arg Ala Ser Leu Leu Ala Ser Lys Gly Phe Ala Thr Leu 180 185 190 Cys Leu Ala Tyr Tyr Asn Tyr Glu Asp Leu Pro Lys Lys Met Glu Asp 195 200 205 Val Asp Leu Glu Tyr Phe Glu Glu Ala Val Asn Tyr Leu Leu Asn His 210 215 220 Pro Tyr Val Lys Gly Pro Gly Ile Gly Ile Leu Gly Val Ser Phe Gly 225 230 235 240 Gly Glu Ile Gly Leu Ser Met Ala Thr Arg Leu Lys Gln Ile Thr Ala 245 250 255 Ala Val Ile Ile Asn Gly Pro His Ala Ala Cys Gly Asn Thr Leu Leu 260 265 270 Tyr Lys His Glu Thr Tyr Pro Pro Val Gln Ile Leu Asp Asp Gly Val 275 280 285 Lys Trp Phe Leu Asn Gly Leu Leu Glu Tyr 
Val Pro Ala Phe Arg Asp 290 295 300 Leu Tyr Ser Asn Leu Asp Glu Lys Ser Ser Ile Pro Trp Glu Arg Ala 305 310 315 320 Pro Lys Glu Thr Ala Phe Arg Phe Ile Val Gly Gln Asp Asp His Asn 325 330 335 Trp Lys Ser Glu Phe Tyr Ala Asn Glu Ile Cys Lys Arg Leu Gln Lys 340 345 350 His Gly His His Lys Glu Gln Ile Ile Cys Tyr Pro Asn Gly Gly His 355 360 365 Tyr Ile Glu Pro Pro Tyr Phe Pro His His Glu Ala Val Tyr His Ala 370 375 380 Phe Val Gly Phe Tyr Cys Gly Trp Gly Gly Glu Pro Val Leu His Ala 385 390 395 400 Lys Ser Gln Glu Asp Thr Trp Lys Gln Thr Val Thr Phe Phe His Lys 405 410 415 His Leu Gly Asn Pro Lys Lys 420 52 415 PRT Homo sapiens VARIANT (1)...(415) Xaa = Any Amino Acid 52 Met Ser Ala Thr Leu Ile Leu Glu Pro Pro Gly Arg Cys Cys Trp Asn 1 5 10 15 Glu Pro Val Arg Ile Ala Val Arg Gly Leu Ala Pro Glu Gln Arg Val 20 25 30 Thr Leu Arg Ala Ser Leu Arg Asp Glu Lys Gly Ala Leu Phe Arg Ala 35 40 45 His Ala Arg Tyr Cys Ala Asp Ala Cys Gly Glu Leu Asp Leu Glu Arg 50 55 60 Ala Pro Ala Leu Gly Gly Ser Phe Ala Gly Leu Glu Pro Met Gly Leu 65 70 75 80 Leu Trp Ala Leu Glu Pro Glu Lys Pro Phe Trp Arg Phe Leu Lys Arg 85 90 95 Asp Val Gln Ile Pro Phe Val Val Glu Leu Glu Val Leu Asp Gly His 100 105 110 Asp Pro Glu Pro Gly Arg Leu Leu Cys Gln Ala Gln His Glu Arg His 115 120 125 Phe Leu Pro Pro Gly Val Arg Arg Gln Ser Val Arg Ala Gly Arg Val 130 135 140 Arg Ala Thr Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 145 150 155 160 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Glu Tyr Arg Ala Ser 165 170 175 Leu Leu Ala Gly His Gly Phe Ala Thr Leu Ala Leu Ala Tyr Tyr Asn 180 185 190 Phe Glu Asp Leu Pro Asn Asn Met Asp Asn Ile Ser Leu Glu Tyr Phe 195 200 205 Glu Glu Ala Val Cys Tyr Met Leu Gln His Pro Gln Val Lys Xaa Xaa 210 215 220 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Cys Leu Ser 225 230 235 240 Met Ala Ser Phe Leu Lys Asn Val Ser Ala Thr Val Ser Ile Asn Gly 245 250 255 Ser Gly Ile Ser Gly Asn Thr Ala Ile Asn Tyr Lys His Ser Ser Ile 260 265 270 Pro Pro Leu Gly Tyr Asp Leu Arg Arg 
Ile Lys Val Ala Phe Ser Gly 275 280 285 Leu Val Asp Ile Val Asp Ile Arg Asn Ala Leu Val Gly Gly Tyr Lys 290 295 300 Asn Pro Ser Met Ile Pro Ile Glu Lys Ala Gln Gly Pro Ile Leu Leu 305 310 315 320 Ile Val Gly Gln Asp Asp His Asn Trp Arg Ser Glu Leu Tyr Ala Gln 325 330 335 Thr Val Ser Glu Arg Leu Gln Ala His Gly Lys Glu Lys Pro Gln Ile 340 345 350 Ile Cys Tyr Pro Gly Thr Gly His Tyr Ile Glu Pro Pro Tyr Phe Pro 355 360 365 Leu Cys Pro Ala Ser Leu His Arg Leu Leu Asn Lys His Val Ile Trp 370 375 380 Gly Gly Glu Pro Arg Ala His Ser Lys Ala Gln Glu Asp Ala Trp Lys 385 390 395 400 Gln Ile Leu Ala Phe Phe Cys Lys His Leu Gly Gly Thr Gln Lys 405 410 415 53 4667 DNA Homo sapiens CDS (420)...(2924) 53 ggcacgaggc aacttggtct gaattccagg tcactaacca cttgtctctt ctgtttcccc 60 attcctttct gtctgcccca tccaatttcc tttgccctct tccacctctg tatttttctg 120 tctgtccgtc tgtctgtatc ctgcctccct gcccctctcg ctccaccccc cgcaggtcgg 180 gcctgccttc accttctccc acttccttcc ccttccccac cccgtgcccc ctccatggag 240 aggaacagac cccttctctg tccagtctaa cccaggtccc tccccaaccc cctcctccct 300 cctttccccc cgcccctcct ccctcctggg gcgagggggg cctccctccc tctccccccc 360 ttctctctct ctccgagggg ggggggtccc agggagggag ggggggtccc ccgatcagc 419 atg tgg ctc ctg gcg ctg tgt ctg gtg ggg ctg gcg ggg gct caa cgc 467 Met Trp Leu Leu Ala Leu Cys Leu Val Gly Leu Ala Gly Ala Gln Arg 1 5 10 15 ggg gga ggg ggt ccc ggc ggc ggc gcc ccg ggc ggc ccc ggc ctg ggc 515 Gly Gly Gly Gly Pro Gly Gly Gly Ala Pro Gly Gly Pro Gly Leu Gly 20 25 30 ctc ggc agc ctc ggc gag gag cgc ttc ccg gtg gtg aac acg gcc tac 563 Leu Gly Ser Leu Gly Glu Glu Arg Phe Pro Val Val Asn Thr Ala Tyr 35 40 45 ggg cga gtg cgc ggt gtg cgg cgc gag ctc aac aac gag atc ctg ggc 611 Gly Arg Val Arg Gly Val Arg Arg Glu Leu Asn Asn Glu Ile Leu Gly 50 55 60 ccc gtc gtg cag ttc ttg ggc gtg ccc tac gcc acg ccg ccc ctg ggc 659 Pro Val Val Gln Phe Leu Gly Val Pro Tyr Ala Thr Pro Pro Leu Gly 65 70 75 80 gcc cgc cgc ttc cag ccg cct gag gcg ccc gcc tcg tgg ccc ggc gtg 707 Ala Arg Arg Phe Gln Pro Pro Glu Ala 
Pro Ala Ser Trp Pro Gly Val 85 90 95 cgc aac gcc acc acc ctg ccg ccc gcc tgc ccg cag aac ctg cac ggg 755 Arg Asn Ala Thr Thr Leu Pro Pro Ala Cys Pro Gln Asn Leu His Gly 100 105 110 gcg ctg ccc gcc atc atg ctg cct gtg tgg ttc acc gac aac ttg gag 803 Ala Leu Pro Ala Ile Met Leu Pro Val Trp Phe Thr Asp Asn Leu Glu 115 120 125 gcg gcc gcc acc tac gtg cag aac cag agc gag gac tgc ctg tac ctc 851 Ala Ala Ala Thr Tyr Val Gln Asn Gln Ser Glu Asp Cys Leu Tyr Leu 130 135 140 aac ctc tac gtg ccc acc gag gac ggt ccg ctc aca aaa aaa cgt gac 899 Asn Leu Tyr Val Pro Thr Glu Asp Gly Pro Leu Thr Lys Lys Arg Asp 145 150 155 160 gag gcg acg ctc aat ccg cca gac aca gat atc cgt gac cct ggg aag 947 Glu Ala Thr Leu Asn Pro Pro Asp Thr Asp Ile Arg Asp Pro Gly Lys 165 170 175 aag cct gtg atg ctg ttt ctc cat ggc ggc tcc tac atg gag ggg acc 995 Lys Pro Val Met Leu Phe Leu His Gly Gly Ser Tyr Met Glu Gly Thr 180 185 190 gga aac atg ttc gat ggc tca gtc ctg gct gcc tat ggc aac gtc att 1043 Gly Asn Met Phe Asp Gly Ser Val Leu Ala Ala Tyr Gly Asn Val Ile 195 200 205 gta gcc acg ctc aac tac cgt ctt ggg gtg ctc ggt ttt ctc agc acc 1091 Val Ala Thr Leu Asn Tyr Arg Leu Gly Val Leu Gly Phe Leu Ser Thr 210 215 220 ggg gac cag gct gca aaa ggc aac tat ggg ctc ctg gac cag atc cag 1139 Gly Asp Gln Ala Ala Lys Gly Asn Tyr Gly Leu Leu Asp Gln Ile Gln 225 230 235 240 gcc ctg cgc tgg ctc agt gaa aac atc gcc cac ttt ggg ggc gac ccc 1187 Ala Leu Arg Trp Leu Ser Glu Asn Ile Ala His Phe Gly Gly Asp Pro 245 250 255 gag cgt atc acc atc ttt ggt tcc ggg gca ggg gcc tcc tgc gtc aac 1235 Glu Arg Ile Thr Ile Phe Gly Ser Gly Ala Gly Ala Ser Cys Val Asn 260 265 270 ctt ctg atc ctc tcc cac cat tca gaa ggg ctg ttc cag aag gcc atc 1283 Leu Leu Ile Leu Ser His His Ser Glu Gly Leu Phe Gln Lys Ala Ile 275 280 285 gcc cag agt ggc acc gcc att tcc agc tgg tct gtc aac tac cag ccg 1331 Ala Gln Ser Gly Thr Ala Ile Ser Ser Trp Ser Val Asn Tyr Gln Pro 290 295 300 ctc aag tac acg cgg ctg ctg gca gcc aag gtg ggc tgt gac cga gag 1379 Leu 
Lys Tyr Thr Arg Leu Leu Ala Ala Lys Val Gly Cys Asp Arg Glu 305 310 315 320 gac agt gct gaa gct gtg gag tgt ctg cgc cgg aag ccc tcc cgg gag 1427 Asp Ser Ala Glu Ala Val Glu Cys Leu Arg Arg Lys Pro Ser Arg Glu 325 330 335 ctg gtg gac cag gac gtg cag cct gcc cgc tac cac atc gcc ttt ggg 1475 Leu Val Asp Gln Asp Val Gln Pro Ala Arg Tyr His Ile Ala Phe Gly 340 345 350 ccc gtg gtg gat ggc gac gtg gtc ccc gat gac cct gag atc ctc atg 1523 Pro Val Val Asp Gly Asp Val Val Pro Asp Asp Pro Glu Ile Leu Met 355 360 365 cag cag gga gaa ttc ctc aac tac gac atg ctc atc ggc gtc aac cag 1571 Gln Gln Gly Glu Phe Leu Asn Tyr Asp Met Leu Ile Gly Val Asn Gln 370 375 380 gga gag ggc ctc aag ttc gtg gag gac tct gca gag agc gag gac ggt 1619 Gly Glu Gly Leu Lys Phe Val Glu Asp Ser Ala Glu Ser Glu Asp Gly 385 390 395 400 gtg tct gcc agc gcc ttt gac ttc act gtc tcc aac ttt gtg gac aac 1667 Val Ser Ala Ser Ala Phe Asp Phe Thr Val Ser Asn Phe Val Asp Asn 405 410 415 ctg tat ggc tac ccg gaa ggc aag gat gtg ctt cgg gag acc atc aag 1715 Leu Tyr Gly Tyr Pro Glu Gly Lys Asp Val Leu Arg Glu Thr Ile Lys 420 425 430 ttt atg tac aca gac tgg gcc gac cgg gac aat ggc gaa atg cgc cgc 1763 Phe Met Tyr Thr Asp Trp Ala Asp Arg Asp Asn Gly Glu Met Arg Arg 435 440 445 aaa acc ctg ctg gcg ctc ttt act gac cac caa tgg gtg gca cca gct 1811 Lys Thr Leu Leu Ala Leu Phe Thr Asp His Gln Trp Val Ala Pro Ala 450 455 460 gtg gcc act gcc aag ctg cac gcc gac tac cag tct ccc gtc tac ttt 1859 Val Ala Thr Ala Lys Leu His Ala Asp Tyr Gln Ser Pro Val Tyr Phe 465 470 475 480 tac acc ttc tac cac cac tgc cag gcg gag ggc cgg cct gag tgg gca 1907 Tyr Thr Phe Tyr His His Cys Gln Ala Glu Gly Arg Pro Glu Trp Ala 485 490 495 gat gcg gcg cac ggg gat gaa ctg ccc tat gtc ttt ggc gtg ccc atg 1955 Asp Ala Ala His Gly Asp Glu Leu Pro Tyr Val Phe Gly Val Pro Met 500 505 510 gtg ggt gcc acc gac ctc ttc ccc tgt aac ttc tcc aag aat gac gtc 2003 Val Gly Ala Thr Asp Leu Phe Pro Cys Asn Phe Ser Lys Asn Asp Val 515 520 525 atg ctc agt gcc gtg gtc atg 
acc tac tgg acc aac ttc gcc aag act 2051 Met Leu Ser Ala Val Val Met Thr Tyr Trp Thr Asn Phe Ala Lys Thr 530 535 540 ggg gac ccc aac cag ccg gtg ccg cag gat acc aag ttc atc cac acc 2099 Gly Asp Pro Asn Gln Pro Val Pro Gln Asp Thr Lys Phe Ile His Thr 545 550 555 560 aag ccc aat cgc ttc gag gag gtg gtg tgg agc aaa ttc aac agc aag 2147 Lys Pro Asn Arg Phe Glu Glu Val Val Trp Ser Lys Phe Asn Ser Lys 565 570 575 gag aag cag tat ctg cac ata ggc ctg aag cca cgc gtg cgt gac aac 2195 Glu Lys Gln Tyr Leu His Ile Gly Leu Lys Pro Arg Val Arg Asp Asn 580 585 590 tac cgc gcc aac aag gtg gcc ttc tgg ctg gag ctc gtg ccc cac ctg 2243 Tyr Arg Ala Asn Lys Val Ala Phe Trp Leu Glu Leu Val Pro His Leu 595 600 605 cac aac ctg cac acg gag ctc ttc acc acc acc acg cgc ctg cct ccc 2291 His Asn Leu His Thr Glu Leu Phe Thr Thr Thr Thr Arg Leu Pro Pro 610 615 620 tac gcc acg cgc tgg ccg cct cgt ccc ccc gct ggc gcc ccg ggc aca 2339 Tyr Ala Thr Arg Trp Pro Pro Arg Pro Pro Ala Gly Ala Pro Gly Thr 625 630 635 640 cgc cgg ccc ccg ccg cct gcc acc ctg cct ccc gag ccc gag ccc gag 2387 Arg Arg Pro Pro Pro Pro Ala Thr Leu Pro Pro Glu Pro Glu Pro Glu 645 650 655 ccc ggc cca agg gcc tat gac cgc ttc ccc ggg gac tca cgg gac tac 2435 Pro Gly Pro Arg Ala Tyr Asp Arg Phe Pro Gly Asp Ser Arg Asp Tyr 660 665 670 tcc acg gag ctg agc gtc acc gtg gcc gtg ggt gcc tcc ctc ctc ttc 2483 Ser Thr Glu Leu Ser Val Thr Val Ala Val Gly Ala Ser Leu Leu Phe 675 680 685 ctc aac atc ctg gcc ttt gct gcc ctc tac tac aag cgg gac cgg cgg 2531 Leu Asn Ile Leu Ala Phe Ala Ala Leu Tyr Tyr Lys Arg Asp Arg Arg 690 695 700 cag gag ctg cgg tgc agg cgg ctt agc cca cct ggc ggc tca ggc tct 2579 Gln Glu Leu Arg Cys Arg Arg Leu Ser Pro Pro Gly Gly Ser Gly Ser 705 710 715 720 ggc gtg cct ggt ggg ggc ccc ctg ctc ccc gcc gcg ggc cgt gag ctg 2627 Gly Val Pro Gly Gly Gly Pro Leu Leu Pro Ala Ala Gly Arg Glu Leu 725 730 735 cca cca gag gag gag ctg gtg tca ctg cag ctg aag cgg ggt ggt ggc 2675 Pro Pro Glu Glu Glu Leu Val Ser Leu Gln Leu Lys Arg Gly Gly 
Gly 740 745 750 gtc ggg gcg gac cct gcc gag gct ctg cgc cct gcc tgc ccg ccc gac 2723 Val Gly Ala Asp Pro Ala Glu Ala Leu Arg Pro Ala Cys Pro Pro Asp 755 760 765 tac acc ctg gcc ctg cgc cgg gca ccg gac gat gtg cct ctc ttg gcc 2771 Tyr Thr Leu Ala Leu Arg Arg Ala Pro Asp Asp Val Pro Leu Leu Ala 770 775 780 ccc ggg gcc ctg acc ctg ctg ccc agt ggc ctg ggg cca ccg cca ccc 2819 Pro Gly Ala Leu Thr Leu Leu Pro Ser Gly Leu Gly Pro Pro Pro Pro 785 790 795 800 cca ccg ccc ccc tcc ctt cat ccc ttc ggg ccc ttc ccc ccg ccc cct 2867 Pro Pro Pro Pro Ser Leu His Pro Phe Gly Pro Phe Pro Pro Pro Pro 805 810 815 ccc acc gcc acc agc cac aac aac acg cta ccc cac ccc cac tcc acc 2915 Pro Thr Ala Thr Ser His Asn Asn Thr Leu Pro His Pro His Ser Thr 820 825 830 act cgg gta tagggggtgg gtggggaggc cctcctcccc ggccctccct 2964 Thr Arg Val 835 ggcccggcca ctccgaaggc agggaggagg acttggcaac tggcttttct cctgtggagt 3024 cgtcacacgc catccagcag cgctaaggtg gacatgggat tcctccctgc gatgcgtgtc 3084 tttcccacgc agagaagccc cagtctcttc tctggatctg ggcctttgaa caactggggg 3144 gcgttttctc ccccccattg ggacaccagt cttcggtgtg tggaatgtgg tattttcccg 3204 cgtggaggtg tgctttctca caacggggtg tgttttccca tgtgcagggt gaggtttttt 3264 tttgccaccc tggacacatg ttggccccct caaagaattt ctgtggggat ttgtacccca 3324 gaatcctgtt cccccatccc ttctcccacc tcctcccctc tccctccccc tggagaccct 3384 ggaagtggtg tgttcacata cagtgaccct tggccaccag accacagagg atggagcctg 3444 ggaagcagcg aggaaatcac agccccctcg cccctgcctc ccttgcccct accccggcga 3504 agcatgttcc ccccgacgcc ccccttggca caagtcagat gaagcacgtt ctgccgggga 3564 ggccctcacc ttccagagag gacagacaca gatttcctgc tgggggaggg aggagtccac 3624 gcatcctgat gctgcctgga agcttatttt cccgtggcca ggacgcattt ctctgagtgg 3684 aaacaggttc ttgcatgtgg atgtgtgttt ccccaggcag acggcccctc tcttcccagc 3744 acttccctgc ctcccccagg cctcaggccc agcacccagt tcctcctcac atggcaggtg 3804 agcacagact tctagttggc aggagctgag gagggtgaac aaaccccgag ggaggcccgg 3864 cccttgctcc cgagttgggg ggagggggtg tggcaacgtg ccccccgcag aggccacgca 3924 tgtttgacca aagccctcat tgtggtccga 
ggacagcctt ttccccaggc ctcagagcat 3984 tgctcatccg tgccaaactg ggtaggtgga tttgagcgga aagactccca aaatgtgcca 4044 agaatttccc agtcccaggc agggcagggg aaactaaggg caagcaggat acagggcgag 4104 ggatgtggca ggtgaggggg ctcccgcctg tgccccttct cctcaccatg tctcccccac 4164 cctgcctcag ttctccgttc cccttcatct ccgtccccct ctttgaagct gtccccatct 4224 cagtgtcaga ccagccttct cctcatctga ccaccctcct ctgaccgacg ccccctcctt 4284 gtctgaaaga aaggagcctt gaatggtgga gggaggcagt ggggagaaag gtctcaccgg 4344 acaggttggg agaatgaggt cagcggtgct ggggaacaga tggagggggc agtggggaca 4404 gggcttgggc agacaccagc aggaataatt tgaaatgtgt gaggtgactc cccggagggc 4464 cttgggcttg ggcatttggg aaaagaatga tgtctggaag ggcttaaggg acacagtgga 4524 cgaggggaga gtcctcatct gctggcattt tgtggggtgt tagtgccaaa cttgaatagg 4584 ggctggggtg ctgtcttcca ctgacaccca aatccagaat ccctggtctt gagtcccaga 4644 actttgcctc ttgactgtcc ctc 4667 54 835 PRT Homo sapiens 54 Met Trp Leu Leu Ala Leu Cys Leu Val Gly Leu Ala Gly Ala Gln Arg 1 5 10 15 Gly Gly Gly Gly Pro Gly Gly Gly Ala Pro Gly Gly Pro Gly Leu Gly 20 25 30 Leu Gly Ser Leu Gly Glu Glu Arg Phe Pro Val Val Asn Thr Ala Tyr 35 40 45 Gly Arg Val Arg Gly Val Arg Arg Glu Leu Asn Asn Glu Ile Leu Gly 50 55 60 Pro Val Val Gln Phe Leu Gly Val Pro Tyr Ala Thr Pro Pro Leu Gly 65 70 75 80 Ala Arg Arg Phe Gln Pro Pro Glu Ala Pro Ala Ser Trp Pro Gly Val 85 90 95 Arg Asn Ala Thr Thr Leu Pro Pro Ala Cys Pro Gln Asn Leu His Gly 100 105 110 Ala Leu Pro Ala Ile Met Leu Pro Val Trp Phe Thr Asp Asn Leu Glu 115 120 125 Ala Ala Ala Thr Tyr Val Gln Asn Gln Ser Glu Asp Cys Leu Tyr Leu 130 135 140 Asn Leu Tyr Val Pro Thr Glu Asp Gly Pro Leu Thr Lys Lys Arg Asp 145 150 155 160 Glu Ala Thr Leu Asn Pro Pro Asp Thr Asp Ile Arg Asp Pro Gly Lys 165 170 175 Lys Pro Val Met Leu Phe Leu His Gly Gly Ser Tyr Met Glu Gly Thr 180 185 190 Gly Asn Met Phe Asp Gly Ser Val Leu Ala Ala Tyr Gly Asn Val Ile 195 200 205 Val Ala Thr Leu Asn Tyr Arg Leu Gly Val Leu Gly Phe Leu Ser Thr 210 215 220 Gly Asp Gln Ala Ala Lys Gly Asn Tyr Gly Leu Leu Asp Gln Ile Gln 225 
230 235 240 Ala Leu Arg Trp Leu Ser Glu Asn Ile Ala His Phe Gly Gly Asp Pro 245 250 255 Glu Arg Ile Thr Ile Phe Gly Ser Gly Ala Gly Ala Ser Cys Val Asn 260 265 270 Leu Leu Ile Leu Ser His His Ser Glu Gly Leu Phe Gln Lys Ala Ile 275 280 285 Ala Gln Ser Gly Thr Ala Ile Ser Ser Trp Ser Val Asn Tyr Gln Pro 290 295 300 Leu Lys Tyr Thr Arg Leu Leu Ala Ala Lys Val Gly Cys Asp Arg Glu 305 310 315 320 Asp Ser Ala Glu Ala Val Glu Cys Leu Arg Arg Lys Pro Ser Arg Glu 325 330 335 Leu Val Asp Gln Asp Val Gln Pro Ala Arg Tyr His Ile Ala Phe Gly 340 345 350 Pro Val Val Asp Gly Asp Val Val Pro Asp Asp Pro Glu Ile Leu Met 355 360 365 Gln Gln Gly Glu Phe Leu Asn Tyr Asp Met Leu Ile Gly Val Asn Gln 370 375 380 Gly Glu Gly Leu Lys Phe Val Glu Asp Ser Ala Glu Ser Glu Asp Gly 385 390 395 400 Val Ser Ala Ser Ala Phe Asp Phe Thr Val Ser Asn Phe Val Asp Asn 405 410 415 Leu Tyr Gly Tyr Pro Glu Gly Lys Asp Val Leu Arg Glu Thr Ile Lys 420 425 430 Phe Met Tyr Thr Asp Trp Ala Asp Arg Asp Asn Gly Glu Met Arg Arg 435 440 445 Lys Thr Leu Leu Ala Leu Phe Thr Asp His Gln Trp Val Ala Pro Ala 450 455 460 Val Ala Thr Ala Lys Leu His Ala Asp Tyr Gln Ser Pro Val Tyr Phe 465 470 475 480 Tyr Thr Phe Tyr His His Cys Gln Ala Glu Gly Arg Pro Glu Trp Ala 485 490 495 Asp Ala Ala His Gly Asp Glu Leu Pro Tyr Val Phe Gly Val Pro Met 500 505 510 Val Gly Ala Thr Asp Leu Phe Pro Cys Asn Phe Ser Lys Asn Asp Val 515 520 525 Met Leu Ser Ala Val Val Met Thr Tyr Trp Thr Asn Phe Ala Lys Thr 530 535 540 Gly Asp Pro Asn Gln Pro Val Pro Gln Asp Thr Lys Phe Ile His Thr 545 550 555 560 Lys Pro Asn Arg Phe Glu Glu Val Val Trp Ser Lys Phe Asn Ser Lys 565 570 575 Glu Lys Gln Tyr Leu His Ile Gly Leu Lys Pro Arg Val Arg Asp Asn 580 585 590 Tyr Arg Ala Asn Lys Val Ala Phe Trp Leu Glu Leu Val Pro His Leu 595 600 605 His Asn Leu His Thr Glu Leu Phe Thr Thr Thr Thr Arg Leu Pro Pro 610 615 620 Tyr Ala Thr Arg Trp Pro Pro Arg Pro Pro Ala Gly Ala Pro Gly Thr 625 630 635 640 Arg Arg Pro Pro Pro Pro Ala Thr Leu Pro Pro Glu Pro Glu Pro Glu 645 
650 655 Pro Gly Pro Arg Ala Tyr Asp Arg Phe Pro Gly Asp Ser Arg Asp Tyr 660 665 670 Ser Thr Glu Leu Ser Val Thr Val Ala Val Gly Ala Ser Leu Leu Phe 675 680 685 Leu Asn Ile Leu Ala Phe Ala Ala Leu Tyr Tyr Lys Arg Asp Arg Arg 690 695 700 Gln Glu Leu Arg Cys Arg Arg Leu Ser Pro Pro Gly Gly Ser Gly Ser 705 710 715 720 Gly Val Pro Gly Gly Gly Pro Leu Leu Pro Ala Ala Gly Arg Glu Leu 725 730 735 Pro Pro Glu Glu Glu Leu Val Ser Leu Gln Leu Lys Arg Gly Gly Gly 740 745 750 Val Gly Ala Asp Pro Ala Glu Ala Leu Arg Pro Ala Cys Pro Pro Asp 755 760 765 Tyr Thr Leu Ala Leu Arg Arg Ala Pro Asp Asp Val Pro Leu Leu Ala 770 775 780 Pro Gly Ala Leu Thr Leu Leu Pro Ser Gly Leu Gly Pro Pro Pro Pro 785 790 795 800 Pro Pro Pro Pro Ser Leu His Pro Phe Gly Pro Phe Pro Pro Pro Pro 805 810 815 Pro Thr Ala Thr Ser His Asn Asn Thr Leu Pro His Pro His Ser Thr 820 825 830 Thr Arg Val 835 55 2508 DNA Homo sapiens 55 atgtggctcc tggcgctgtg tctggtgggg ctggcggggg ctcaacgcgg gggagggggt 60 cccggcggcg gcgccccggg cggccccggc ctgggcctcg gcagcctcgg cgaggagcgc 120 ttcccggtgg tgaacacggc ctacgggcga gtgcgcggtg tgcggcgcga gctcaacaac 180 gagatcctgg gccccgtcgt gcagttcttg ggcgtgccct acgccacgcc gcccctgggc 240 gcccgccgct tccagccgcc tgaggcgccc gcctcgtggc ccggcgtgcg caacgccacc 300 accctgccgc ccgcctgccc gcagaacctg cacggggcgc tgcccgccat catgctgcct 360 gtgtggttca ccgacaactt ggaggcggcc gccacctacg tgcagaacca gagcgaggac 420 tgcctgtacc tcaacctcta cgtgcccacc gaggacggtc cgctcacaaa aaaacgtgac 480 gaggcgacgc tcaatccgcc agacacagat atccgtgacc ctgggaagaa gcctgtgatg 540 ctgtttctcc atggcggctc ctacatggag gggaccggaa acatgttcga tggctcagtc 600 ctggctgcct atggcaacgt cattgtagcc acgctcaact accgtcttgg ggtgctcggt 660 tttctcagca ccggggacca ggctgcaaaa ggcaactatg ggctcctgga ccagatccag 720 gccctgcgct ggctcagtga aaacatcgcc cactttgggg gcgaccccga gcgtatcacc 780 atctttggtt ccggggcagg ggcctcctgc gtcaaccttc tgatcctctc ccaccattca 840 gaagggctgt tccagaaggc catcgcccag agtggcaccg ccatttccag ctggtctgtc 900 aactaccagc cgctcaagta cacgcggctg ctggcagcca aggtgggctg 
tgaccgagag 960 gacagtgctg aagctgtgga gtgtctgcgc cggaagccct cccgggagct ggtggaccag 1020 gacgtgcagc ctgcccgcta ccacatcgcc tttgggcccg tggtggatgg cgacgtggtc 1080 cccgatgacc ctgagatcct catgcagcag ggagaattcc tcaactacga catgctcatc 1140 ggcgtcaacc agggagaggg cctcaagttc gtggaggact ctgcagagag cgaggacggt 1200 gtgtctgcca gcgcctttga cttcactgtc tccaactttg tggacaacct gtatggctac 1260 ccggaaggca aggatgtgct tcgggagacc atcaagttta tgtacacaga ctgggccgac 1320 cgggacaatg gcgaaatgcg ccgcaaaacc ctgctggcgc tctttactga ccaccaatgg 1380 gtggcaccag ctgtggccac tgccaagctg cacgccgact accagtctcc cgtctacttt 1440 tacaccttct accaccactg ccaggcggag ggccggcctg agtgggcaga tgcggcgcac 1500 ggggatgaac tgccctatgt ctttggcgtg cccatggtgg gtgccaccga cctcttcccc 1560 tgtaacttct ccaagaatga cgtcatgctc agtgccgtgg tcatgaccta ctggaccaac 1620 ttcgccaaga ctggggaccc caaccagccg gtgccgcagg ataccaagtt catccacacc 1680 aagcccaatc gcttcgagga ggtggtgtgg agcaaattca acagcaagga gaagcagtat 1740 ctgcacatag gcctgaagcc acgcgtgcgt gacaactacc gcgccaacaa ggtggccttc 1800 tggctggagc tcgtgcccca cctgcacaac ctgcacacgg agctcttcac caccaccacg 1860 cgcctgcctc cctacgccac gcgctggccg cctcgtcccc ccgctggcgc cccgggcaca 1920 cgccggcccc cgccgcctgc caccctgcct cccgagcccg agcccgagcc cggcccaagg 1980 gcctatgacc gcttccccgg ggactcacgg gactactcca cggagctgag cgtcaccgtg 2040 gccgtgggtg cctccctcct cttcctcaac atcctggcct ttgctgccct ctactacaag 2100 cgggaccggc ggcaggagct gcggtgcagg cggcttagcc cacctggcgg ctcaggctct 2160 ggcgtgcctg gtgggggccc cctgctcccc gccgcgggcc gtgagctgcc accagaggag 2220 gagctggtgt cactgcagct gaagcggggt ggtggcgtcg gggcggaccc tgccgaggct 2280 ctgcgccctg cctgcccgcc cgactacacc ctggccctgc gccgggcacc ggacgatgtg 2340 cctctcttgg cccccggggc cctgaccctg ctgcccagtg gcctggggcc accgccaccc 2400 ccaccgcccc cctcccttca tcccttcggg cccttccccc cgccccctcc caccgccacc 2460 agccacaaca acacgctacc ccacccccac tccaccactc gggtatag 2508 56 585 PRT Artificial Sequence consensus sequence 56 Leu Leu Val Ala Thr Asn Asn Val Leu Cys Gly Lys Val Arg Gly Val 1 5 10 15 Asn Glu Lys Thr Asp Asn 
Gly Glu Gln Ser Val Tyr Ser Phe Leu Gly 20 25 30 Ile Pro Tyr Ala Glu Pro Pro Val Gly Asn Leu Arg Phe Lys Ala Pro 35 40 45 Gln Pro Tyr Lys Glu Pro Trp Ser Asp Val Leu Asp Ala Thr Lys Tyr 50 55 60 Pro Pro Ser Cys Leu Gln Asp Asp Asp Phe Gly Phe Ser Leu Ser Asp 65 70 75 80 Leu Lys Val Ala Leu Lys Met Leu Ser Leu Gly Trp Asn Lys Leu Val 85 90 95 Gly Leu Lys Leu Ser Glu Asp Cys Leu Tyr Leu Asn Val Tyr Thr Pro 100 105 110 Lys Asn Thr Lys Pro Asn Ser Lys Leu Pro Val Met Val Trp Ile His 115 120 125 Gly Gly Gly Phe Met Phe Gly Ser Gly His Ser Leu Pro Leu Ser Leu 130 135 140 Tyr Asp Gly Glu Ser Leu Ala Arg Glu Gly Asn Val Ile Val Val Ser 145 150 155 160 Ile Asn Tyr Arg Leu Gly Pro Leu Gly Phe Leu Ser Thr Gly Asp Asp 165 170 175 Lys Leu Pro Gly Ser Gly Asn Tyr Gly Leu Leu Leu Asp Gln Arg Leu 180 185 190 Ala Leu Lys Trp Val Gln Asp Asn Ile Ala Ala Phe Gly Gly Asp Pro 195 200 205 Asn Ser Val Thr Ile Phe Gly Glu Ser Ala Gly Ala Ala Ser Val Ser 210 215 220 Leu Leu Leu Leu Ser Asn Gly Gly Asp Asn Pro Pro Ser Ser Lys Gly 225 230 235 240 Leu Phe His Arg Ala Ile Ser Gln Ser Gly Ser Ala Leu Ser Pro Trp 245 250 255 Ala Ile Gln Ser Glu Ser Asn Ala Arg Gly Arg Ala Lys Glu Leu Ala 260 265 270 Arg Leu Leu Gly Cys Asn Glu Thr Ser Ser Ser Glu Leu Leu Asp Cys 275 280 285 Leu Arg Ser Lys Ser Ala Glu Glu Leu Leu Glu Ala Thr Arg Ser Phe 290 295 300 Leu Leu Phe Glu Tyr Val Pro Phe Leu Pro Leu Phe Leu Ala Phe Gly 305 310 315 320 Pro Val Val Asp Gly Asp Asp Ala Pro Glu Ala Phe Ile Pro Glu Asp 325 330 335 Pro Glu Glu Leu Ile Lys Glu Gly Lys Phe Ala Asp Val Pro Tyr Leu 340 345 350 Ile Gly Val Thr Lys Asp Glu Gly Gly Tyr Phe Ala Ala Met Leu Leu 355 360 365 Asn Ala Ser Ser Lys Gly Glu Asp Glu Leu Lys Lys Glu Thr Asn Pro 370 375 380 Asp Val Trp Leu Glu Leu Leu Lys Tyr Leu Leu Phe Tyr Ala Ser Glu 385 390 395 400 Ala Leu Asn Ile Lys Asp Met Asp Asp Leu Ala Asp Lys Val Leu Glu 405 410 415 Lys Tyr Pro Gly Asp Val Asp Asp Phe Ser Val Glu Ser Arg Lys Pro 420 425 430 Asn Leu Gln Asp Met Leu Thr Asp Leu Leu Phe 
Lys Cys Pro Thr Arg 435 440 445 Val Ala Ala Asp Leu His Ala Lys His Gly Gly Ser Pro Val Tyr Ala 450 455 460 Tyr Val Phe Asp His Pro Ala Ser Phe Gly Ile Gly Gln Phe Leu Ala 465 470 475 480 Lys Arg Val Asp Pro Glu Phe Gly Gly Ala Val His Gly Asp Glu Ile 485 490 495 Phe Phe Val Phe Gly Asn Pro Leu Leu Lys Glu Gln Leu Tyr Lys Ala 500 505 510 Thr Glu Glu Glu Glu Lys Ser Ser Ser Lys Thr Met Met Asn Tyr Trp 515 520 525 Ala Asn Phe Ala Lys Thr Gly Asn Pro Asn Asn Gly Thr Ser Asn Gly 530 535 540 Leu Val Val Trp Pro Lys Tyr Thr Ser Glu Glu Gln Lys Tyr Ser Leu 545 550 555 560 Leu Ile Leu Leu Thr Thr Ile Thr Ala Gln Lys Leu Lys Ala Arg Asp 565 570 575 Pro Arg Lys Val Leu Cys Asn Phe Trp 580 585 57 836 PRT Rattus norvegicus 57 Met Trp Leu Leu Ala Leu Cys Leu Val Gly Leu Ala Gly Ala Gln Arg 1 5 10 15 Gly Gly Gly Gly Pro Gly Gly Gly Ala Pro Gly Gly Pro Gly Leu Gly 20 25 30 Leu Gly Ser Leu Gly Glu Glu Arg Phe Pro Val Val Asn Thr Ala Tyr 35 40 45 Gly Arg Val Arg Gly Val Arg Arg Glu Leu Asn Asn Glu Ile Leu Gly 50 55 60 Pro Val Val Gln Phe Leu Gly Val Pro Tyr Ala Thr Pro Pro Leu Gly 65 70 75 80 Ala Arg Arg Phe Gln Pro Pro Glu Ala Pro Ala Ser Trp Pro Gly Val 85 90 95 Arg Asn Ala Thr Thr Leu Pro Pro Ala Cys Pro Gln Asn Leu His Gly 100 105 110 Ala Leu Pro Ala Ile Met Leu Pro Val Trp Phe Thr Asp Asn Leu Glu 115 120 125 Ala Ala Ala Thr Tyr Val Gln Asn Gln Ser Glu Asp Cys Leu Tyr Leu 130 135 140 Asn Leu Tyr Val Pro Thr Glu Asp Gly Pro Leu Thr Lys Lys Arg Asp 145 150 155 160 Glu Ala Thr Leu Asn Pro Pro Asp Thr Asp Ile Arg Asp Ser Gly Lys 165 170 175 Lys Pro Val Met Leu Phe Leu His Gly Gly Ser Tyr Met Glu Gly Thr 180 185 190 Gly Asn Met Phe Asp Gly Ser Val Leu Ala Ala Tyr Gly Asn Val Ile 195 200 205 Val Ala Thr Leu Asn Tyr Arg Leu Gly Val Leu Gly Phe Leu Ser Thr 210 215 220 Gly Asp Gln Ala Ala Lys Gly Asn Tyr Gly Leu Leu Asp Gln Ile Gln 225 230 235 240 Ala Leu Arg Trp Leu Ser Glu Asn Ile Ala His Phe Gly Gly Asp Pro 245 250 255 Glu Arg Ile Thr Ile Phe Gly Ser Gly Ala Gly Ala Ser Cys Val Asn 
260 265 270 Leu Leu Ile Leu Ser His His Ser Glu Gly Leu Phe Gln Lys Ala Ile 275 280 285 Ala Gln Ser Gly Thr Ala Ile Ser Ser Trp Ser Val Asn Tyr Gln Pro 290 295 300 Leu Lys Tyr Thr Arg Leu Leu Ala Ala Lys Val Gly Cys Asp Arg Glu 305 310 315 320 Asp Ser Thr Glu Ala Val Glu Cys Leu Arg Arg Lys Ser Ser Arg Glu 325 330 335 Leu Val Asp Gln Asp Val Gln Pro Ala Arg Tyr His Ile Ala Phe Gly 340 345 350 Pro Val Val Asp Gly Asp Val Val Pro Asp Asp Pro Glu Ile Leu Met 355 360 365 Gln Gln Gly Glu Phe Leu Asn Tyr Asp Met Leu Ile Gly Val Asn Gln 370 375 380 Gly Glu Gly Leu Lys Phe Val Glu Asp Ser Ala Glu Ser Glu Asp Gly 385 390 395 400 Val Ser Ala Ser Ala Phe Asp Phe Thr Val Ser Asn Phe Val Asp Asn 405 410 415 Leu Tyr Gly Tyr Pro Glu Gly Lys Asp Val Leu Arg Glu Thr Ile Lys 420 425 430 Phe Met Tyr Thr Asp Trp Ala Asp Arg Asp Asn Gly Glu Met Arg Arg 435 440 445 Lys Thr Leu Leu Ala Leu Phe Thr Asp His Gln Trp Val Ala Pro Ala 450 455 460 Val Ala Thr Ala Lys Leu His Ala Asp Tyr Gln Ser Pro Val Tyr Phe 465 470 475 480 Tyr Thr Phe Tyr His His Cys Gln Ala Glu Gly Arg Pro Glu Trp Ala 485 490 495 Asp Ala Ala His Gly Asp Glu Leu Pro Tyr Val Phe Gly Val Pro Met 500 505 510 Val Gly Ala Thr Asp Leu Phe Pro Cys Asn Phe Ser Lys Asn Asp Val 515 520 525 Met Leu Ser Ala Val Val Met Thr Tyr Trp Thr Asn Phe Ala Lys Thr 530 535 540 Gly Asp Pro Asn Gln Pro Val Pro Gln Asp Thr Lys Phe Ile His Thr 545 550 555 560 Lys Pro Asn Arg Phe Glu Glu Val Val Trp Ser Lys Phe Asn Ser Lys 565 570 575 Glu Lys Gln Tyr Leu His Ile Gly Leu Lys Pro Arg Val Arg Asp Asn 580 585 590 Tyr Arg Ala Asn Lys Val Ala Phe Trp Leu Glu Leu Val Pro His Leu 595 600 605 His Asn Leu His Thr Glu Leu Phe Thr Thr Thr Thr Arg Leu Pro Pro 610 615 620 Tyr Ala Thr Arg Trp Pro Pro Arg Thr Pro Gly Pro Gly Thr Ser Gly 625 630 635 640 Thr Arg Arg Pro Pro Pro Pro Ala Thr Leu Pro Pro Glu Ser Asp Ile 645 650 655 Asp Leu Gly Pro Arg Ala Tyr Asp Arg Phe Pro Gly Asp Ser Arg Asp 660 665 670 Tyr Ser Thr Glu Leu Ser Val Thr Val Ala Val Gly Ala Ser Leu Leu 675 
680 685 Phe Leu Asn Ile Leu Ala Phe Ala Ala Leu Tyr Tyr Lys Arg Asp Arg 690 695 700 Arg Gln Glu Leu Arg Cys Arg Arg Leu Ser Pro Pro Gly Gly Ser Gly 705 710 715 720 Ser Gly Val Pro Gly Gly Gly Pro Leu Leu Pro Thr Ala Gly Arg Glu 725 730 735 Leu Pro Pro Glu Glu Glu Leu Val Ser Leu Gln Leu Lys Arg Gly Gly 740 745 750 Gly Val Gly Ala Asp Pro Ala Glu Ala Leu Arg Pro Ala Cys Pro Pro 755 760 765 Asp Tyr Thr Leu Ala Leu Arg Arg Ala Pro Asp Asp Val Pro Leu Leu 770 775 780 Ala Pro Gly Ala Leu Thr Leu Leu Pro Ser Gly Leu Gly Pro Pro Pro 785 790 795 800 Pro Pro Pro Pro Pro Ser Leu His Pro Phe Gly Pro Phe Pro Pro Pro 805 810 815 Pro Pro Thr Ala Thr Ser His Asn Asn Thr Leu Pro His Pro His Ser 820 825 830 Thr Thr Arg Val 835 58 550 PRT Homo sapiens 58 Lys Ala Ile Ala Gln Ser Gly Thr Ala Ile Ser Ser Trp Ser Val Asn 1 5 10 15 Tyr Gln Pro Leu Lys Tyr Thr Arg Leu Leu Ala Ala Lys Val Gly Cys 20 25 30 Asp Arg Glu Asp Ser Ala Glu Ala Val Glu Cys Leu Arg Arg Lys Pro 35 40 45 Ser Arg Glu Leu Val Asp Gln Asp Val Gln Pro Ala Arg Tyr His Ile 50 55 60 Ala Phe Gly Pro Val Val Asp Gly Asp Val Val Pro Asp Asp Pro Glu 65 70 75 80 Ile Leu Met Gln Gln Gly Glu Phe Leu Asn Tyr Asp Met Leu Ile Gly 85 90 95 Val Asn Gln Gly Glu Gly Leu Lys Phe Val Glu Asp Ser Ala Glu Ser 100 105 110 Glu Asp Gly Val Ser Ala Ser Ala Phe Asp Phe Thr Val Ser Asn Phe 115 120 125 Val Asp Asn Leu Tyr Gly Tyr Pro Glu Gly Lys Asp Val Leu Arg Glu 130 135 140 Thr Ile Lys Phe Met Tyr Thr Asp Trp Ala Asp Arg Asp Asn Gly Glu 145 150 155 160 Met Arg Arg Lys Thr Leu Leu Ala Leu Phe Thr Asp His Gln Trp Val 165 170 175 Ala Pro Ala Val Ala Thr Ala Lys Leu His Ala Asp Tyr Gln Ser Pro 180 185 190 Val Tyr Phe Tyr Thr Phe Tyr His His Cys Gln Ala Glu Gly Arg Pro 195 200 205 Glu Trp Ala Asp Ala Ala His Gly Asp Glu Leu Pro Tyr Val Phe Gly 210 215 220 Val Pro Met Val Gly Ala Thr Asp Leu Phe Pro Cys Asn Phe Ser Lys 225 230 235 240 Asn Asp Val Met Leu Ser Ala Val Val Met Thr Tyr Trp Thr Asn Phe 245 250 255 Ala Lys Thr Gly Asp Pro Asn Gln Pro Val 
Pro Gln Asp Thr Lys Phe 260 265 270 Ile His Thr Lys Pro Asn Arg Phe Glu Glu Val Val Trp Ser Lys Phe 275 280 285 Asn Ser Lys Glu Lys Gln Tyr Leu His Ile Gly Leu Lys Pro Arg Val 290 295 300 Arg Asp Asn Tyr Arg Ala Asn Lys Val Ala Phe Trp Leu Glu Leu Val 305 310 315 320 Pro His Leu His Asn Leu His Thr Glu Leu Phe Thr Thr Thr Thr Arg 325 330 335 Leu Pro Pro Tyr Ala Thr Arg Trp Pro Pro Arg Pro Pro Ala Gly Ala 340 345 350 Pro Gly Thr Arg Arg Pro Pro Pro Pro Ala Thr Leu Pro Pro Glu Pro 355 360 365 Glu Pro Glu Pro Gly Pro Arg Ala Tyr Asp Arg Phe Pro Gly Asp Ser 370 375 380 Arg Asp Tyr Ser Thr Glu Leu Ser Val Thr Val Ala Val Gly Ala Ser 385 390 395 400 Leu Leu Phe Leu Asn Ile Leu Ala Phe Ala Ala Leu Tyr Tyr Lys Arg 405 410 415 Asp Arg Arg Gln Glu Leu Arg Cys Arg Arg Leu Ser Pro Pro Gly Gly 420 425 430 Ser Gly Ser Gly Val Pro Gly Gly Gly Pro Leu Leu Pro Ala Ala Gly 435 440 445 Arg Glu Leu Pro Pro Glu Glu Glu Leu Val Ser Leu Gln Leu Lys Arg 450 455 460 Gly Gly Gly Val Gly Ala Asp Pro Ala Glu Ala Leu Arg Pro Ala Cys 465 470 475 480 Pro Pro Asp Tyr Thr Leu Ala Leu Arg Arg Ala Pro Asp Asp Val Pro 485 490 495 Leu Leu Ala Pro Gly Ala Leu Thr Leu Leu Pro Ser Gly Leu Gly Pro 500 505 510 Pro Pro Pro Pro Pro Pro Pro Ser Leu His Pro Phe Gly Pro Phe Pro 515 520 525 Pro Pro Pro Pro Thr Ala Thr Ser His Asn Asn Thr Leu Pro His Pro 530 535 540 His Ser Thr Thr Arg Val 545 550 59 16 PRT Artificial Sequence exemplary motif 59 Phe Xaa Gly Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Gly Xaa Ser Xaa Gly 1 5 10 15 60 7 PRT Artificial Sequence signature domain 60 Glu Asp Xaa Cys Leu Tyr Xaa 1 5 61 5437 DNA Homo sapiens CDS (248)...(5350) 61 gcactgtgac aagctgcacg ctctagagtc gacccagcag gctctgtgta tgaatgacaa 60 ggataccttc agccagctca ttctggatga atgaatgatt acactaagtg tcctccacat 120 tcctctgtgg gctcacttca tggactcact ttgcgtgctt gttaaatgtg ctgcgttgct 180 cccaagacca tgtaaagcct actgaccact aacctccctc acagcagaaa ctagacgtca 240 ggttaaa atg ggc aac tcc gac agt cag tac acc ctt caa gga tct aaa 289 Met Gly Asn Ser Asp Ser Gln 
Tyr Thr Leu Gln Gly Ser Lys 1 5 10 aat cat agc aat act att act ggt gct aag caa att cct tgc tcc ctg 337 Asn His Ser Asn Thr Ile Thr Gly Ala Lys Gln Ile Pro Cys Ser Leu 15 20 25 30 aaa ata cgt ggc gtt cat gca aaa gag gaa aag tca ttg cat gga tgg 385 Lys Ile Arg Gly Val His Ala Lys Glu Glu Lys Ser Leu His Gly Trp 35 40 45 ggt cac ggg agc aac gga gca ggt tac aag tcc agg tcc ctg gcc cga 433 Gly His Gly Ser Asn Gly Ala Gly Tyr Lys Ser Arg Ser Leu Ala Arg 50 55 60 agc tgc ctt tct cac ttt aag agt aac cag cct tac gca tcg aga ctc 481 Ser Cys Leu Ser His Phe Lys Ser Asn Gln Pro Tyr Ala Ser Arg Leu 65 70 75 ggt ggc ccc aca tgc aag gtc tcc aga ggt gtt gcc tac tcc acg cac 529 Gly Gly Pro Thr Cys Lys Val Ser Arg Gly Val Ala Tyr Ser Thr His 80 85 90 agg aca aat gcc cca ggg aag gat ttc cag ggc atc agt gct gct ttc 577 Arg Thr Asn Ala Pro Gly Lys Asp Phe Gln Gly Ile Ser Ala Ala Phe 95 100 105 110 tca act gag aat ggc ttc cat tct gtt ggc cac gag ctg gca gat aac 625 Ser Thr Glu Asn Gly Phe His Ser Val Gly His Glu Leu Ala Asp Asn 115 120 125 cac atc acc tcc aga gac tgc aac gga cac ctt ctc aac tgc tac ggg 673 His Ile Thr Ser Arg Asp Cys Asn Gly His Leu Leu Asn Cys Tyr Gly 130 135 140 agg aat gag agc att gcc tcc acc cca ccg ggc gaa gac cgc aag agc 721 Arg Asn Glu Ser Ile Ala Ser Thr Pro Pro Gly Glu Asp Arg Lys Ser 145 150 155 ccc cga gtg ctc atc aaa acg ctg ggg aag ccg gat ggg tgt tta agg 769 Pro Arg Val Leu Ile Lys Thr Leu Gly Lys Pro Asp Gly Cys Leu Arg 160 165 170 gtc gag ttc cac aat ggt ggc aac ccc agc aaa gtg cct gca gag gac 817 Val Glu Phe His Asn Gly Gly Asn Pro Ser Lys Val Pro Ala Glu Asp 175 180 185 190 tgc agt gag ccg gtg cag ctg ctg agg tac tca cct acc tta gca tcg 865 Cys Ser Glu Pro Val Gln Leu Leu Arg Tyr Ser Pro Thr Leu Ala Ser 195 200 205 gaa acc tcc cct gtg cct gaa gcc agg agg ggg tcc agc gcc gat tcc 913 Glu Thr Ser Pro Val Pro Glu Ala Arg Arg Gly Ser Ser Ala Asp Ser 210 215 220 ctg ccc agc cat cgc ccc tct ccc acg gac tct cgc ctg cgg tcc agc 961 Leu Pro Ser His Arg Pro Ser 
Pro Thr Asp Ser Arg Leu Arg Ser Ser 225 230 235 aaa ggc agc tcc ctg agt tct gag tca tcc tgg tac gac tcc cct tgg 1009 Lys Gly Ser Ser Leu Ser Ser Glu Ser Ser Trp Tyr Asp Ser Pro Trp 240 245 250 ggc aat gct gga gag ctg agc gag gct gag ggc tcc ttc ctg gcc ccc 1057 Gly Asn Ala Gly Glu Leu Ser Glu Ala Glu Gly Ser Phe Leu Ala Pro 255 260 265 270 ggc atg cct gac ccc agt ctc cat gcc agc ttc cca cct ggc gat gcc 1105 Gly Met Pro Asp Pro Ser Leu His Ala Ser Phe Pro Pro Gly Asp Ala 275 280 285 aaa aag cct ttc aac caa agc tct tcc ctc tcc tcc ctc cgg gaa ctg 1153 Lys Lys Pro Phe Asn Gln Ser Ser Ser Leu Ser Ser Leu Arg Glu Leu 290 295 300 tac aaa gat gcc aac ctg ggg agc ctc tcc ccc tca ggt atc cgc ctt 1201 Tyr Lys Asp Ala Asn Leu Gly Ser Leu Ser Pro Ser Gly Ile Arg Leu 305 310 315 tct gat gaa tac atg ggc acg cat gcc agc ctg agc aac cgt gtc tct 1249 Ser Asp Glu Tyr Met Gly Thr His Ala Ser Leu Ser Asn Arg Val Ser 320 325 330 ttt gct tcc gac att gat gtg ccc tcc aga gtg gca cac ggg gac ccc 1297 Phe Ala Ser Asp Ile Asp Val Pro Ser Arg Val Ala His Gly Asp Pro 335 340 345 350 atc cag tac agt tcc ttc act ctc ccc tgt cgg aag ccc aaa gcc ttt 1345 Ile Gln Tyr Ser Ser Phe Thr Leu Pro Cys Arg Lys Pro Lys Ala Phe 355 360 365 gtt gag gat act gcg aag aag gac tcc ctc aaa gcc agg atg cga cgg 1393 Val Glu Asp Thr Ala Lys Lys Asp Ser Leu Lys Ala Arg Met Arg Arg 370 375 380 atc agt gac tgg acg gga agc ctc tca agg aag aaa agg aaa ctc cag 1441 Ile Ser Asp Trp Thr Gly Ser Leu Ser Arg Lys Lys Arg Lys Leu Gln 385 390 395 gag ccg agg tcc aag gag ggc agt gac tac ttt gac agt cgc tct gat 1489 Glu Pro Arg Ser Lys Glu Gly Ser Asp Tyr Phe Asp Ser Arg Ser Asp 400 405 410 gga ctg aat aca gat gtg cag gga tcc tcc cag gca tct gct ttt ctg 1537 Gly Leu Asn Thr Asp Val Gln Gly Ser Ser Gln Ala Ser Ala Phe Leu 415 420 425 430 tgg tca ggg ggc tct act cag atc ctg tct cag aga agt gaa tcc aca 1585 Trp Ser Gly Gly Ser Thr Gln Ile Leu Ser Gln Arg Ser Glu Ser Thr 435 440 445 cat gcg att ggc agc gat ccc ctc cgg cag aac att tat 
gag aat ttc 1633 His Ala Ile Gly Ser Asp Pro Leu Arg Gln Asn Ile Tyr Glu Asn Phe 450 455 460 atg cga gag ttg gaa atg agc agg acc aac act gag aac ata gaa aca 1681 Met Arg Glu Leu Glu Met Ser Arg Thr Asn Thr Glu Asn Ile Glu Thr 465 470 475 tct aca gaa acc gcc gag tcc agc agc gag tca ctc agc tct ctg gaa 1729 Ser Thr Glu Thr Ala Glu Ser Ser Ser Glu Ser Leu Ser Ser Leu Glu 480 485 490 cag ctg gat ctg ctc ttc gag aag gaa cag ggg gtg gtc cgg agg gcc 1777 Gln Leu Asp Leu Leu Phe Glu Lys Glu Gln Gly Val Val Arg Arg Ala 495 500 505 510 ggg tgg ctc ttc ttc aag ccc ctg gtc act gtg cag aag gaa agg aag 1825 Gly Trp Leu Phe Phe Lys Pro Leu Val Thr Val Gln Lys Glu Arg Lys 515 520 525 ctt gag ctg gtg gca cga agg aaa tgg aaa cag tac tgg gta acg ctg 1873 Leu Glu Leu Val Ala Arg Arg Lys Trp Lys Gln Tyr Trp Val Thr Leu 530 535 540 aaa gga tgc acg ctg ctg ttt tat gag acc tat ggg aag aat tcc atg 1921 Lys Gly Cys Thr Leu Leu Phe Tyr Glu Thr Tyr Gly Lys Asn Ser Met 545 550 555 gat cag agc agt gcc cct cgg tgt gct ctg ttt gca gaa gac agc ata 1969 Asp Gln Ser Ser Ala Pro Arg Cys Ala Leu Phe Ala Glu Asp Ser Ile 560 565 570 gtg cag tct gtt cca gag cat ccc aag aaa gaa aat gtg ttc tgc ctc 2017 Val Gln Ser Val Pro Glu His Pro Lys Lys Glu Asn Val Phe Cys Leu 575 580 585 590 agc aac tcc ttt gga gat gtc tac ctt ttc cag gcc acc agc cag aca 2065 Ser Asn Ser Phe Gly Asp Val Tyr Leu Phe Gln Ala Thr Ser Gln Thr 595 600 605 gat cta gaa aac tgg gtc act gct gta cac tct gct tgt gca tcc ctt 2113 Asp Leu Glu Asn Trp Val Thr Ala Val His Ser Ala Cys Ala Ser Leu 610 615 620 ttt gca aag aag cat ggg aaa gag gac acg ctg cgg ctg ctg aag aac 2161 Phe Ala Lys Lys His Gly Lys Glu Asp Thr Leu Arg Leu Leu Lys Asn 625 630 635 cag acc aaa aac ctg ctt cag aag ata gac atg gac agc aag atg aag 2209 Gln Thr Lys Asn Leu Leu Gln Lys Ile Asp Met Asp Ser Lys Met Lys 640 645 650 aag atg gca gag ctg cag ctg tcc gtg gtg agc gac cca aag aac agg 2257 Lys Met Ala Glu Leu Gln Leu Ser Val Val Ser Asp Pro Lys Asn Arg 655 660 665 670 aaa gcc 
ata gag aac cag atc cag caa tgg gag cag aat ctt gag aaa 2305 Lys Ala Ile Glu Asn Gln Ile Gln Gln Trp Glu Gln Asn Leu Glu Lys 675 680 685 ttt cac atg gat ctg ttc agg atg cgc tgc tat ctg gcc agc cta caa 2353 Phe His Met Asp Leu Phe Arg Met Arg Cys Tyr Leu Ala Ser Leu Gln 690 695 700 ggt ggg gag tta ccg aac cca aag agt ctc ctt gca gcc gcc agc cgc 2401 Gly Gly Glu Leu Pro Asn Pro Lys Ser Leu Leu Ala Ala Ala Ser Arg 705 710 715 ccc tcc aag ctg gcc ctc ggc agg ctg ggc atc ttg tct gtt tcc tct 2449 Pro Ser Lys Leu Ala Leu Gly Arg Leu Gly Ile Leu Ser Val Ser Ser 720 725 730 ttc cat gct ctg gta tgt tct aga gat gac tct gct ctc cgg aaa agg 2497 Phe His Ala Leu Val Cys Ser Arg Asp Asp Ser Ala Leu Arg Lys Arg 735 740 745 750 aca ctg tca ctg acc cag cga ggg aga aac aag aag gga ata ttt tct 2545 Thr Leu Ser Leu Thr Gln Arg Gly Arg Asn Lys Lys Gly Ile Phe Ser 755 760 765 tcg tta aaa ggg ctg gac aca ctg gcc aga aaa ggc aag gag aag aga 2593 Ser Leu Lys Gly Leu Asp Thr Leu Ala Arg Lys Gly Lys Glu Lys Arg 770 775 780 cct tct ata act cag gtc gat gaa ctt ctg cat ata tat ggt tca aca 2641 Pro Ser Ile Thr Gln Val Asp Glu Leu Leu His Ile Tyr Gly Ser Thr 785 790 795 gtg gac ggt gtt ccc cga gac aat gca tgg gaa atc cag act tat gtt 2689 Val Asp Gly Val Pro Arg Asp Asn Ala Trp Glu Ile Gln Thr Tyr Val 800 805 810 cac ttt cag gac aat cac gga gtt act gta ggg atc aag cca gag cac 2737 His Phe Gln Asp Asn His Gly Val Thr Val Gly Ile Lys Pro Glu His 815 820 825 830 aga gta gaa gat att ttg act ttg gca tgc aag atg agg cag ttg gaa 2785 Arg Val Glu Asp Ile Leu Thr Leu Ala Cys Lys Met Arg Gln Leu Glu 835 840 845 ccc agc cat tat ggc cta cag ctt cga aaa tta gta gat gac aat gtt 2833 Pro Ser His Tyr Gly Leu Gln Leu Arg Lys Leu Val Asp Asp Asn Val 850 855 860 gag tat tgc atc cct gca cca tat gaa tat atg caa caa cag gtt tat 2881 Glu Tyr Cys Ile Pro Ala Pro Tyr Glu Tyr Met Gln Gln Gln Val Tyr 865 870 875 gat gaa ata gaa gtc ttt cca cta aat gtt tat gat gtg cag ctc acg 2929 Asp Glu Ile Glu Val Phe Pro Leu Asn Val Tyr 
Asp Val Gln Leu Thr 880 885 890 aag act ggg agt gtg tgt gac ttt ggg ttt gca gtt aca gcg cag gtg 2977 Lys Thr Gly Ser Val Cys Asp Phe Gly Phe Ala Val Thr Ala Gln Val 895 900 905 910 gat gag cgt cag cat ctc agc cgg ata ttt ata agc gac gtt ctt ccc 3025 Asp Glu Arg Gln His Leu Ser Arg Ile Phe Ile Ser Asp Val Leu Pro 915 920 925 gat ggc ctg gcg tat ggg gaa ggg ctg aga aag ggc aat gag atc atg 3073 Asp Gly Leu Ala Tyr Gly Glu Gly Leu Arg Lys Gly Asn Glu Ile Met 930 935 940 acc tta aat ggg gaa gct gtg tct gat ctt gac ctt aag cag atg gag 3121 Thr Leu Asn Gly Glu Ala Val Ser Asp Leu Asp Leu Lys Gln Met Glu 945 950 955 gcc ctg ttt tct gag aag agc gtc gga ctc act ctg att gcc cgg cct 3169 Ala Leu Phe Ser Glu Lys Ser Val Gly Leu Thr Leu Ile Ala Arg Pro 960 965 970 ccg gac aca aaa gca acc ctg tgt aca tcc tgg tca gac agt gac ctg 3217 Pro Asp Thr Lys Ala Thr Leu Cys Thr Ser Trp Ser Asp Ser Asp Leu 975 980 985 990 ttc tcc agg gac cag aag agt ctg ctg ccc cct cct aac cag tcc caa 3265 Phe Ser Arg Asp Gln Lys Ser Leu Leu Pro Pro Pro Asn Gln Ser Gln 995 1000 1005 ctg ctg gag gaa ttc ctg gat aac ttt aaa aag aat aca gcc aat gat 3313 Leu Leu Glu Glu Phe Leu Asp Asn Phe Lys Lys Asn Thr Ala Asn Asp 1010 1015 1020 ttc agc aac gtc cct gat atc aca aca ggt ctg aaa agg agt cag aca 3361 Phe Ser Asn Val Pro Asp Ile Thr Thr Gly Leu Lys Arg Ser Gln Thr 1025 1030 1035 gat ggc act ctg gat cag gtt tcc cac agg gag aaa atg gag cag aca 3409 Asp Gly Thr Leu Asp Gln Val Ser His Arg Glu Lys Met Glu Gln Thr 1040 1045 1050 ttc agg agt gct gag cag atc act gca ctg tgc agg agt ttt aac gac 3457 Phe Arg Ser Ala Glu Gln Ile Thr Ala Leu Cys Arg Ser Phe Asn Asp 1055 1060 1065 1070 agt cag gcc aac ggc atg gaa gga ccg cgg gag aat cag gat cct cct 3505 Ser Gln Ala Asn Gly Met Glu Gly Pro Arg Glu Asn Gln Asp Pro Pro 1075 1080 1085 ccg agg cct ctg gcc cgc cac ctg tct gat gca gac cgc ctc cgc aaa 3553 Pro Arg Pro Leu Ala Arg His Leu Ser Asp Ala Asp Arg Leu Arg Lys 1090 1095 1100 gtc atc cag gag ctt gtg gac aca gag aag tcc tac 
gtg aag gat ttg 3601 Val Ile Gln Glu Leu Val Asp Thr Glu Lys Ser Tyr Val Lys Asp Leu 1105 1110 1115 agc tgc ctc ttt gaa tta tac ttg gag cca ctt cag aat gag acc ttt 3649 Ser Cys Leu Phe Glu Leu Tyr Leu Glu Pro Leu Gln Asn Glu Thr Phe 1120 1125 1130 ctt acc caa gat gag atg gag tca ctt ttt gga agt ttg cca gag atg 3697 Leu Thr Gln Asp Glu Met Glu Ser Leu Phe Gly Ser Leu Pro Glu Met 1135 1140 1145 1150 ctt gag ttt cag aag gtg ttt ctg gag acc ctg gag gat ggg att tca 3745 Leu Glu Phe Gln Lys Val Phe Leu Glu Thr Leu Glu Asp Gly Ile Ser 1155 1160 1165 gca tca tct gac ttt aac acc cta gaa acc ccc tca cag ttt aga aaa 3793 Ala Ser Ser Asp Phe Asn Thr Leu Glu Thr Pro Ser Gln Phe Arg Lys 1170 1175 1180 tta ctg ttt tcc ctt gga ggc tct ttc ctt tat tac gcg gac cac ttt 3841 Leu Leu Phe Ser Leu Gly Gly Ser Phe Leu Tyr Tyr Ala Asp His Phe 1185 1190 1195 aaa ctg tac agt gga ttc tgt gct aac cat atc aaa gta cag aag gtt 3889 Lys Leu Tyr Ser Gly Phe Cys Ala Asn His Ile Lys Val Gln Lys Val 1200 1205 1210 ctg gag cga gct aaa act gac aaa gcc ttc aag gct ttt ctg gat gcc 3937 Leu Glu Arg Ala Lys Thr Asp Lys Ala Phe Lys Ala Phe Leu Asp Ala 1215 1220 1225 1230 cgg aac ccc acc aag cag cat tcc tcc acg ctg gag tcc tac ctc atc 3985 Arg Asn Pro Thr Lys Gln His Ser Ser Thr Leu Glu Ser Tyr Leu Ile 1235 1240 1245 aag ccg gtt cag aga gtg ctc aag tac ccg ctg ctg ctc aag gag ctg 4033 Lys Pro Val Gln Arg Val Leu Lys Tyr Pro Leu Leu Leu Lys Glu Leu 1250 1255 1260 gtg tcc ctg acg gac cag gag agc gag gag cac tac cac ctg acg gaa 4081 Val Ser Leu Thr Asp Gln Glu Ser Glu Glu His Tyr His Leu Thr Glu 1265 1270 1275 gca cta aag gca atg gag aaa gta gcg agc cac atc aat gag atg cag 4129 Ala Leu Lys Ala Met Glu Lys Val Ala Ser His Ile Asn Glu Met Gln 1280 1285 1290 aag atc tat gag gat tat ggg acc gtg ttt gac cag cta gta gct gag 4177 Lys Ile Tyr Glu Asp Tyr Gly Thr Val Phe Asp Gln Leu Val Ala Glu 1295 1300 1305 1310 cag agc gga aca gag aag gag gta aca gaa ctt tcg atg gga gag ctt 4225 Gln Ser Gly Thr Glu Lys Glu Val Thr Glu 
Leu Ser Met Gly Glu Leu 1315 1320 1325 ctg atg cac tct acg gtt tcc tgg ttg aat cca ttt ctg tct cta gga 4273 Leu Met His Ser Thr Val Ser Trp Leu Asn Pro Phe Leu Ser Leu Gly 1330 1335 1340 aaa gct aga aag gac ctt gag ctc aca gta ttt gtt ttt aag aga gcc 4321 Lys Ala Arg Lys Asp Leu Glu Leu Thr Val Phe Val Phe Lys Arg Ala 1345 1350 1355 gtc ata ctg gtt tat aaa gaa aac tgc aaa ctg aaa aag aaa ttg ccc 4369 Val Ile Leu Val Tyr Lys Glu Asn Cys Lys Leu Lys Lys Lys Leu Pro 1360 1365 1370 tcg aat tcc cgg cct gca cac aac tct act gac ttg gac cca ttt aaa 4417 Ser Asn Ser Arg Pro Ala His Asn Ser Thr Asp Leu Asp Pro Phe Lys 1375 1380 1385 1390 ttc cgc tgg ttg atc ccc atc tcc gcg ctt caa gtc aga ctg ggg aat 4465 Phe Arg Trp Leu Ile Pro Ile Ser Ala Leu Gln Val Arg Leu Gly Asn 1395 1400 1405 cca gca ggg aca gaa aat aat tcc ata tgg gaa ctg atc cat acg aag 4513 Pro Ala Gly Thr Glu Asn Asn Ser Ile Trp Glu Leu Ile His Thr Lys 1410 1415 1420 tca gaa ata gaa gga cgg cca gaa acc atc ttt cag ttg tgt tgc agt 4561 Ser Glu Ile Glu Gly Arg Pro Glu Thr Ile Phe Gln Leu Cys Cys Ser 1425 1430 1435 gac agt gaa agc aaa acc aac att gtt aag gtg att cgt tct att ctg 4609 Asp Ser Glu Ser Lys Thr Asn Ile Val Lys Val Ile Arg Ser Ile Leu 1440 1445 1450 agg gaa aac ttc agg cgt cac ata aag tgt gaa tta cca ctg gag aaa 4657 Arg Glu Asn Phe Arg Arg His Ile Lys Cys Glu Leu Pro Leu Glu Lys 1455 1460 1465 1470 acg tgt aag gat cgc ctg gta cct ctt aag aac cga gtt cct gtt tcg 4705 Thr Cys Lys Asp Arg Leu Val Pro Leu Lys Asn Arg Val Pro Val Ser 1475 1480 1485 gcc aaa tta gct tca tcc agg tct tta aaa gtc ctg aag aat tcc tcc 4753 Ala Lys Leu Ala Ser Ser Arg Ser Leu Lys Val Leu Lys Asn Ser Ser 1490 1495 1500 agc aac gag tgg acc ggt gag act ggc aag gga acc ttg ctg gac tct 4801 Ser Asn Glu Trp Thr Gly Glu Thr Gly Lys Gly Thr Leu Leu Asp Ser 1505 1510 1515 gac gag ggc agc ttg agc agc ggc acc cag agc agc ggc tgc ccc acg 4849 Asp Glu Gly Ser Leu Ser Ser Gly Thr Gln Ser Ser Gly Cys Pro Thr 1520 1525 1530 gct gag ggc agg cag gac 
tcc aag agc act tct ccc ggg aaa tac cca 4897 Ala Glu Gly Arg Gln Asp Ser Lys Ser Thr Ser Pro Gly Lys Tyr Pro 1535 1540 1545 1550 cac ccc ggc ttg gca gat ttt gcc gac aat ctc atc aaa gag agt gac 4945 His Pro Gly Leu Ala Asp Phe Ala Asp Asn Leu Ile Lys Glu Ser Asp 1555 1560 1565 atc ctg agc gat gaa gat gat gac cac cgt cag act gtg aag cag ggc 4993 Ile Leu Ser Asp Glu Asp Asp Asp His Arg Gln Thr Val Lys Gln Gly 1570 1575 1580 agc cct act aaa gac atc gaa att cag ttc cag aga ctg agg att tcc 5041 Ser Pro Thr Lys Asp Ile Glu Ile Gln Phe Gln Arg Leu Arg Ile Ser 1585 1590 1595 gag gac cca gac gtt cac ccc gag gct gag cag cag cct ggc ccg gag 5089 Glu Asp Pro Asp Val His Pro Glu Ala Glu Gln Gln Pro Gly Pro Glu 1600 1605 1610 tcg ggt gag ggt cag aaa gga gga gag cag ccc aaa ctg gtc cgg ggg 5137 Ser Gly Glu Gly Gln Lys Gly Gly Glu Gln Pro Lys Leu Val Arg Gly 1615 1620 1625 1630 cac ttc tgc ccc att aaa cga aaa gcc aac agc acc aag agg gac aga 5185 His Phe Cys Pro Ile Lys Arg Lys Ala Asn Ser Thr Lys Arg Asp Arg 1635 1640 1645 gga act ttg ctc aag gcg cag atc cgt cac cag tcc ctt gac agt cag 5233 Gly Thr Leu Leu Lys Ala Gln Ile Arg His Gln Ser Leu Asp Ser Gln 1650 1655 1660 tct gaa aat gcc acc atc gac cta aat tct gtt cta gag cga gaa ttc 5281 Ser Glu Asn Ala Thr Ile Asp Leu Asn Ser Val Leu Glu Arg Glu Phe 1665 1670 1675 agt gtc cag agt tta aca tct gtt gtc agt gag gag tgt ttt tat gaa 5329 Ser Val Gln Ser Leu Thr Ser Val Val Ser Glu Glu Cys Phe Tyr Glu 1680 1685 1690 aca gag agc cac gga aaa tca tagtatgatt caatccagat atgggttaaa 5380 Thr Glu Ser His Gly Lys Ser 1695 1700 ttcctcattt tacttttaaa ctggtggtaa agtggaaatt gcaaaaaaaa aaaaaaa 5437 62 1701 PRT Homo sapiens 62 Met Gly Asn Ser Asp Ser Gln Tyr Thr Leu Gln Gly Ser Lys Asn His 1 5 10 15 Ser Asn Thr Ile Thr Gly Ala Lys Gln Ile Pro Cys Ser Leu Lys Ile 20 25 30 Arg Gly Val His Ala Lys Glu Glu Lys Ser Leu His Gly Trp Gly His 35 40 45 Gly Ser Asn Gly Ala Gly Tyr Lys Ser Arg Ser Leu Ala Arg Ser Cys 50 55 60 Leu Ser His Phe Lys Ser Asn Gln Pro Tyr 
Ala Ser Arg Leu Gly Gly 65 70 75 80 Pro Thr Cys Lys Val Ser Arg Gly Val Ala Tyr Ser Thr His Arg Thr 85 90 95 Asn Ala Pro Gly Lys Asp Phe Gln Gly Ile Ser Ala Ala Phe Ser Thr 100 105 110 Glu Asn Gly Phe His Ser Val Gly His Glu Leu Ala Asp Asn His Ile 115 120 125 Thr Ser Arg Asp Cys Asn Gly His Leu Leu Asn Cys Tyr Gly Arg Asn 130 135 140 Glu Ser Ile Ala Ser Thr Pro Pro Gly Glu Asp Arg Lys Ser Pro Arg 145 150 155 160 Val Leu Ile Lys Thr Leu Gly Lys Pro Asp Gly Cys Leu Arg Val Glu 165 170 175 Phe His Asn Gly Gly Asn Pro Ser Lys Val Pro Ala Glu Asp Cys Ser 180 185 190 Glu Pro Val Gln Leu Leu Arg Tyr Ser Pro Thr Leu Ala Ser Glu Thr 195 200 205 Ser Pro Val Pro Glu Ala Arg Arg Gly Ser Ser Ala Asp Ser Leu Pro 210 215 220 Ser His Arg Pro Ser Pro Thr Asp Ser Arg Leu Arg Ser Ser Lys Gly 225 230 235 240 Ser Ser Leu Ser Ser Glu Ser Ser Trp Tyr Asp Ser Pro Trp Gly Asn 245 250 255 Ala Gly Glu Leu Ser Glu Ala Glu Gly Ser Phe Leu Ala Pro Gly Met 260 265 270 Pro Asp Pro Ser Leu His Ala Ser Phe Pro Pro Gly Asp Ala Lys Lys 275 280 285 Pro Phe Asn Gln Ser Ser Ser Leu Ser Ser Leu Arg Glu Leu Tyr Lys 290 295 300 Asp Ala Asn Leu Gly Ser Leu Ser Pro Ser Gly Ile Arg Leu Ser Asp 305 310 315 320 Glu Tyr Met Gly Thr His Ala Ser Leu Ser Asn Arg Val Ser Phe Ala 325 330 335 Ser Asp Ile Asp Val Pro Ser Arg Val Ala His Gly Asp Pro Ile Gln 340 345 350 Tyr Ser Ser Phe Thr Leu Pro Cys Arg Lys Pro Lys Ala Phe Val Glu 355 360 365 Asp Thr Ala Lys Lys Asp Ser Leu Lys Ala Arg Met Arg Arg Ile Ser 370 375 380 Asp Trp Thr Gly Ser Leu Ser Arg Lys Lys Arg Lys Leu Gln Glu Pro 385 390 395 400 Arg Ser Lys Glu Gly Ser Asp Tyr Phe Asp Ser Arg Ser Asp Gly Leu 405 410 415 Asn Thr Asp Val Gln Gly Ser Ser Gln Ala Ser Ala Phe Leu Trp Ser 420 425 430 Gly Gly Ser Thr Gln Ile Leu Ser Gln Arg Ser Glu Ser Thr His Ala 435 440 445 Ile Gly Ser Asp Pro Leu Arg Gln Asn Ile Tyr Glu Asn Phe Met Arg 450 455 460 Glu Leu Glu Met Ser Arg Thr Asn Thr Glu Asn Ile Glu Thr Ser Thr 465 470 475 480 Glu Thr Ala Glu Ser Ser Ser Glu Ser Leu Ser 
Ser Leu Glu Gln Leu 485 490 495 Asp Leu Leu Phe Glu Lys Glu Gln Gly Val Val Arg Arg Ala Gly Trp 500 505 510 Leu Phe Phe Lys Pro Leu Val Thr Val Gln Lys Glu Arg Lys Leu Glu 515 520 525 Leu Val Ala Arg Arg Lys Trp Lys Gln Tyr Trp Val Thr Leu Lys Gly 530 535 540 Cys Thr Leu Leu Phe Tyr Glu Thr Tyr Gly Lys Asn Ser Met Asp Gln 545 550 555 560 Ser Ser Ala Pro Arg Cys Ala Leu Phe Ala Glu Asp Ser Ile Val Gln 565 570 575 Ser Val Pro Glu His Pro Lys Lys Glu Asn Val Phe Cys Leu Ser Asn 580 585 590 Ser Phe Gly Asp Val Tyr Leu Phe Gln Ala Thr Ser Gln Thr Asp Leu 595 600 605 Glu Asn Trp Val Thr Ala Val His Ser Ala Cys Ala Ser Leu Phe Ala 610 615 620 Lys Lys His Gly Lys Glu Asp Thr Leu Arg Leu Leu Lys Asn Gln Thr 625 630 635 640 Lys Asn Leu Leu Gln Lys Ile Asp Met Asp Ser Lys Met Lys Lys Met 645 650 655 Ala Glu Leu Gln Leu Ser Val Val Ser Asp Pro Lys Asn Arg Lys Ala 660 665 670 Ile Glu Asn Gln Ile Gln Gln Trp Glu Gln Asn Leu Glu Lys Phe His 675 680 685 Met Asp Leu Phe Arg Met Arg Cys Tyr Leu Ala Ser Leu Gln Gly Gly 690 695 700 Glu Leu Pro Asn Pro Lys Ser Leu Leu Ala Ala Ala Ser Arg Pro Ser 705 710 715 720 Lys Leu Ala Leu Gly Arg Leu Gly Ile Leu Ser Val Ser Ser Phe His 725 730 735 Ala Leu Val Cys Ser Arg Asp Asp Ser Ala Leu Arg Lys Arg Thr Leu 740 745 750 Ser Leu Thr Gln Arg Gly Arg Asn Lys Lys Gly Ile Phe Ser Ser Leu 755 760 765 Lys Gly Leu Asp Thr Leu Ala Arg Lys Gly Lys Glu Lys Arg Pro Ser 770 775 780 Ile Thr Gln Val Asp Glu Leu Leu His Ile Tyr Gly Ser Thr Val Asp 785 790 795 800 Gly Val Pro Arg Asp Asn Ala Trp Glu Ile Gln Thr Tyr Val His Phe 805 810 815 Gln Asp Asn His Gly Val Thr Val Gly Ile Lys Pro Glu His Arg Val 820 825 830 Glu Asp Ile Leu Thr Leu Ala Cys Lys Met Arg Gln Leu Glu Pro Ser 835 840 845 His Tyr Gly Leu Gln Leu Arg Lys Leu Val Asp Asp Asn Val Glu Tyr 850 855 860 Cys Ile Pro Ala Pro Tyr Glu Tyr Met Gln Gln Gln Val Tyr Asp Glu 865 870 875 880 Ile Glu Val Phe Pro Leu Asn Val Tyr Asp Val Gln Leu Thr Lys Thr 885 890 895 Gly Ser Val Cys Asp Phe Gly Phe Ala Val Thr Ala 
Gln Val Asp Glu 900 905 910 Arg Gln His Leu Ser Arg Ile Phe Ile Ser Asp Val Leu Pro Asp Gly 915 920 925 Leu Ala Tyr Gly Glu Gly Leu Arg Lys Gly Asn Glu Ile Met Thr Leu 930 935 940 Asn Gly Glu Ala Val Ser Asp Leu Asp Leu Lys Gln Met Glu Ala Leu 945 950 955 960 Phe Ser Glu Lys Ser Val Gly Leu Thr Leu Ile Ala Arg Pro Pro Asp 965 970 975 Thr Lys Ala Thr Leu Cys Thr Ser Trp Ser Asp Ser Asp Leu Phe Ser 980 985 990 Arg Asp Gln Lys Ser Leu Leu Pro Pro Pro Asn Gln Ser Gln Leu Leu 995 1000 1005 Glu Glu Phe Leu Asp Asn Phe Lys Lys Asn Thr Ala Asn Asp Phe Ser 1010 1015 1020 Asn Val Pro Asp Ile Thr Thr Gly Leu Lys Arg Ser Gln Thr Asp Gly 1025 1030 1035 1040 Thr Leu Asp Gln Val Ser His Arg Glu Lys Met Glu Gln Thr Phe Arg 1045 1050 1055 Ser Ala Glu Gln Ile Thr Ala Leu Cys Arg Ser Phe Asn Asp Ser Gln 1060 1065 1070 Ala Asn Gly Met Glu Gly Pro Arg Glu Asn Gln Asp Pro Pro Pro Arg 1075 1080 1085 Pro Leu Ala Arg His Leu Ser Asp Ala Asp Arg Leu Arg Lys Val Ile 1090 1095 1100 Gln Glu Leu Val Asp Thr Glu Lys Ser Tyr Val Lys Asp Leu Ser Cys 1105 1110 1115 1120 Leu Phe Glu Leu Tyr Leu Glu Pro Leu Gln Asn Glu Thr Phe Leu Thr 1125 1130 1135 Gln Asp Glu Met Glu Ser Leu Phe Gly Ser Leu Pro Glu Met Leu Glu 1140 1145 1150 Phe Gln Lys Val Phe Leu Glu Thr Leu Glu Asp Gly Ile Ser Ala Ser 1155 1160 1165 Ser Asp Phe Asn Thr Leu Glu Thr Pro Ser Gln Phe Arg Lys Leu Leu 1170 1175 1180 Phe Ser Leu Gly Gly Ser Phe Leu Tyr Tyr Ala Asp His Phe Lys Leu 1185 1190 1195 1200 Tyr Ser Gly Phe Cys Ala Asn His Ile Lys Val Gln Lys Val Leu Glu 1205 1210 1215 Arg Ala Lys Thr Asp Lys Ala Phe Lys Ala Phe Leu Asp Ala Arg Asn 1220 1225 1230 Pro Thr Lys Gln His Ser Ser Thr Leu Glu Ser Tyr Leu Ile Lys Pro 1235 1240 1245 Val Gln Arg Val Leu Lys Tyr Pro Leu Leu Leu Lys Glu Leu Val Ser 1250 1255 1260 Leu Thr Asp Gln Glu Ser Glu Glu His Tyr His Leu Thr Glu Ala Leu 1265 1270 1275 1280 Lys Ala Met Glu Lys Val Ala Ser His Ile Asn Glu Met Gln Lys Ile 1285 1290 1295 Tyr Glu Asp Tyr Gly Thr Val Phe Asp Gln Leu Val Ala Glu Gln Ser 
1300 1305 1310 Gly Thr Glu Lys Glu Val Thr Glu Leu Ser Met Gly Glu Leu Leu Met 1315 1320 1325 His Ser Thr Val Ser Trp Leu Asn Pro Phe Leu Ser Leu Gly Lys Ala 1330 1335 1340 Arg Lys Asp Leu Glu Leu Thr Val Phe Val Phe Lys Arg Ala Val Ile 1345 1350 1355 1360 Leu Val Tyr Lys Glu Asn Cys Lys Leu Lys Lys Lys Leu Pro Ser Asn 1365 1370 1375 Ser Arg Pro Ala His Asn Ser Thr Asp Leu Asp Pro Phe Lys Phe Arg 1380 1385 1390 Trp Leu Ile Pro Ile Ser Ala Leu Gln Val Arg Leu Gly Asn Pro Ala 1395 1400 1405 Gly Thr Glu Asn Asn Ser Ile Trp Glu Leu Ile His Thr Lys Ser Glu 1410 1415 1420 Ile Glu Gly Arg Pro Glu Thr Ile Phe Gln Leu Cys Cys Ser Asp Ser 1425 1430 1435 1440 Glu Ser Lys Thr Asn Ile Val Lys Val Ile Arg Ser Ile Leu Arg Glu 1445 1450 1455 Asn Phe Arg Arg His Ile Lys Cys Glu Leu Pro Leu Glu Lys Thr Cys 1460 1465 1470 Lys Asp Arg Leu Val Pro Leu Lys Asn Arg Val Pro Val Ser Ala Lys 1475 1480 1485 Leu Ala Ser Ser Arg Ser Leu Lys Val Leu Lys Asn Ser Ser Ser Asn 1490 1495 1500 Glu Trp Thr Gly Glu Thr Gly Lys Gly Thr Leu Leu Asp Ser Asp Glu 1505 1510 1515 1520 Gly Ser Leu Ser Ser Gly Thr Gln Ser Ser Gly Cys Pro Thr Ala Glu 1525 1530 1535 Gly Arg Gln Asp Ser Lys Ser Thr Ser Pro Gly Lys Tyr Pro His Pro 1540 1545 1550 Gly Leu Ala Asp Phe Ala Asp Asn Leu Ile Lys Glu Ser Asp Ile Leu 1555 1560 1565 Ser Asp Glu Asp Asp Asp His Arg Gln Thr Val Lys Gln Gly Ser Pro 1570 1575 1580 Thr Lys Asp Ile Glu Ile Gln Phe Gln Arg Leu Arg Ile Ser Glu Asp 1585 1590 1595 1600 Pro Asp Val His Pro Glu Ala Glu Gln Gln Pro Gly Pro Glu Ser Gly 1605 1610 1615 Glu Gly Gln Lys Gly Gly Glu Gln Pro Lys Leu Val Arg Gly His Phe 1620 1625 1630 Cys Pro Ile Lys Arg Lys Ala Asn Ser Thr Lys Arg Asp Arg Gly Thr 1635 1640 1645 Leu Leu Lys Ala Gln Ile Arg His Gln Ser Leu Asp Ser Gln Ser Glu 1650 1655 1660 Asn Ala Thr Ile Asp Leu Asn Ser Val Leu Glu Arg Glu Phe Ser Val 1665 1670 1675 1680 Gln Ser Leu Thr Ser Val Val Ser Glu Glu Cys Phe Tyr Glu Thr Glu 1685 1690 1695 Ser His Gly Lys Ser 1700 63 5106 DNA Homo sapiens 63 atgggcaact 
ccgacagtca gtacaccctt caaggatcta aaaatcatag caatactatt 60 actggtgcta agcaaattcc ttgctccctg aaaatacgtg gcgttcatgc aaaagaggaa 120 aagtcattgc atggatgggg tcacgggagc aacggagcag gttacaagtc caggtccctg 180 gcccgaagct gcctttctca ctttaagagt aaccagcctt acgcatcgag actcggtggc 240 cccacatgca aggtctccag aggtgttgcc tactccacgc acaggacaaa tgccccaggg 300 aaggatttcc agggcatcag tgctgctttc tcaactgaga atggcttcca ttctgttggc 360 cacgagctgg cagataacca catcacctcc agagactgca acggacacct tctcaactgc 420 tacgggagga atgagagcat tgcctccacc ccaccgggcg aagaccgcaa gagcccccga 480 gtgctcatca aaacgctggg gaagccggat gggtgtttaa gggtcgagtt ccacaatggt 540 ggcaacccca gcaaagtgcc tgcagaggac tgcagtgagc cggtgcagct gctgaggtac 600 tcacctacct tagcatcgga aacctcccct gtgcctgaag ccaggagggg gtccagcgcc 660 gattccctgc ccagccatcg cccctctccc acggactctc gcctgcggtc cagcaaaggc 720 agctccctga gttctgagtc atcctggtac gactcccctt ggggcaatgc tggagagctg 780 agcgaggctg agggctcctt cctggccccc ggcatgcctg accccagtct ccatgccagc 840 ttcccacctg gcgatgccaa aaagcctttc aaccaaagct cttccctctc ctccctccgg 900 gaactgtaca aagatgccaa cctggggagc ctctccccct caggtatccg cctttctgat 960 gaatacatgg gcacgcatgc cagcctgagc aaccgtgtct cttttgcttc cgacattgat 1020 gtgccctcca gagtggcaca cggggacccc atccagtaca gttccttcac tctcccctgt 1080 cggaagccca aagcctttgt tgaggatact gcgaagaagg actccctcaa agccaggatg 1140 cgacggatca gtgactggac gggaagcctc tcaaggaaga aaaggaaact ccaggagccg 1200 aggtccaagg agggcagtga ctactttgac agtcgctctg atggactgaa tacagatgtg 1260 cagggatcct cccaggcatc tgcttttctg tggtcagggg gctctactca gatcctgtct 1320 cagagaagtg aatccacaca tgcgattggc agcgatcccc tccggcagaa catttatgag 1380 aatttcatgc gagagttgga aatgagcagg accaacactg agaacataga aacatctaca 1440 gaaaccgccg agtccagcag cgagtcactc agctctctgg aacagctgga tctgctcttc 1500 gagaaggaac agggggtggt ccggagggcc gggtggctct tcttcaagcc cctggtcact 1560 gtgcagaagg aaaggaagct tgagctggtg gcacgaagga aatggaaaca gtactgggta 1620 acgctgaaag gatgcacgct gctgttttat gagacctatg ggaagaattc catggatcag 1680 agcagtgccc ctcggtgtgc tctgtttgca 
gaagacagca tagtgcagtc tgttccagag 1740 catcccaaga aagaaaatgt gttctgcctc agcaactcct ttggagatgt ctaccttttc 1800 caggccacca gccagacaga tctagaaaac tgggtcactg ctgtacactc tgcttgtgca 1860 tccctttttg caaagaagca tgggaaagag gacacgctgc ggctgctgaa gaaccagacc 1920 aaaaacctgc ttcagaagat agacatggac agcaagatga agaagatggc agagctgcag 1980 ctgtccgtgg tgagcgaccc aaagaacagg aaagccatag agaaccagat ccagcaatgg 2040 gagcagaatc ttgagaaatt tcacatggat ctgttcagga tgcgctgcta tctggccagc 2100 ctacaaggtg gggagttacc gaacccaaag agtctccttg cagccgccag ccgcccctcc 2160 aagctggccc tcggcaggct gggcatcttg tctgtttcct ctttccatgc tctggtatgt 2220 tctagagatg actctgctct ccggaaaagg acactgtcac tgacccagcg agggagaaac 2280 aagaagggaa tattttcttc gttaaaaggg ctggacacac tggccagaaa aggcaaggag 2340 aagagacctt ctataactca ggtcgatgaa cttctgcata tatatggttc aacagtggac 2400 ggtgttcccc gagacaatgc atgggaaatc cagacttatg ttcactttca ggacaatcac 2460 ggagttactg tagggatcaa gccagagcac agagtagaag atattttgac tttggcatgc 2520 aagatgaggc agttggaacc cagccattat ggcctacagc ttcgaaaatt agtagatgac 2580 aatgttgagt attgcatccc tgcaccatat gaatatatgc aacaacaggt ttatgatgaa 2640 atagaagtct ttccactaaa tgtttatgat gtgcagctca cgaagactgg gagtgtgtgt 2700 gactttgggt ttgcagttac agcgcaggtg gatgagcgtc agcatctcag ccggatattt 2760 ataagcgacg ttcttcccga tggcctggcg tatggggaag ggctgagaaa gggcaatgag 2820 atcatgacct taaatgggga agctgtgtct gatcttgacc ttaagcagat ggaggccctg 2880 ttttctgaga agagcgtcgg actcactctg attgcccggc ctccggacac aaaagcaacc 2940 ctgtgtacat cctggtcaga cagtgacctg ttctccaggg accagaagag tctgctgccc 3000 cctcctaacc agtcccaact gctggaggaa ttcctggata actttaaaaa gaatacagcc 3060 aatgatttca gcaacgtccc tgatatcaca acaggtctga aaaggagtca gacagatggc 3120 actctggatc aggtttccca cagggagaaa atggagcaga cattcaggag tgctgagcag 3180 atcactgcac tgtgcaggag ttttaacgac agtcaggcca acggcatgga aggaccgcgg 3240 gagaatcagg atcctcctcc gaggcctctg gcccgccacc tgtctgatgc agaccgcctc 3300 cgcaaagtca tccaggagct tgtggacaca gagaagtcct acgtgaagga tttgagctgc 3360 ctctttgaat tatacttgga gccacttcag aatgagacct 
ttcttaccca agatgagatg 3420 gagtcacttt ttggaagttt gccagagatg cttgagtttc agaaggtgtt tctggagacc 3480 ctggaggatg ggatttcagc atcatctgac tttaacaccc tagaaacccc ctcacagttt 3540 agaaaattac tgttttccct tggaggctct ttcctttatt acgcggacca ctttaaactg 3600 tacagtggat tctgtgctaa ccatatcaaa gtacagaagg ttctggagcg agctaaaact 3660 gacaaagcct tcaaggcttt tctggatgcc cggaacccca ccaagcagca ttcctccacg 3720 ctggagtcct acctcatcaa gccggttcag agagtgctca agtacccgct gctgctcaag 3780 gagctggtgt ccctgacgga ccaggagagc gaggagcact accacctgac ggaagcacta 3840 aaggcaatgg agaaagtagc gagccacatc aatgagatgc agaagatcta tgaggattat 3900 gggaccgtgt ttgaccagct agtagctgag cagagcggaa cagagaagga ggtaacagaa 3960 ctttcgatgg gagagcttct gatgcactct acggtttcct ggttgaatcc atttctgtct 4020 ctaggaaaag ctagaaagga ccttgagctc acagtatttg tttttaagag agccgtcata 4080 ctggtttata aagaaaactg caaactgaaa aagaaattgc cctcgaattc ccggcctgca 4140 cacaactcta ctgacttgga cccatttaaa ttccgctggt tgatccccat ctccgcgctt 4200 caagtcagac tggggaatcc agcagggaca gaaaataatt ccatatggga actgatccat 4260 acgaagtcag aaatagaagg acggccagaa accatctttc agttgtgttg cagtgacagt 4320 gaaagcaaaa ccaacattgt taaggtgatt cgttctattc tgagggaaaa cttcaggcgt 4380 cacataaagt gtgaattacc actggagaaa acgtgtaagg atcgcctggt acctcttaag 4440 aaccgagttc ctgtttcggc caaattagct tcatccaggt ctttaaaagt cctgaagaat 4500 tcctccagca acgagtggac cggtgagact ggcaagggaa ccttgctgga ctctgacgag 4560 ggcagcttga gcagcggcac ccagagcagc ggctgcccca cggctgaggg caggcaggac 4620 tccaagagca cttctcccgg gaaataccca caccccggct tggcagattt tgccgacaat 4680 ctcatcaaag agagtgacat cctgagcgat gaagatgatg accaccgtca gactgtgaag 4740 cagggcagcc ctactaaaga catcgaaatt cagttccaga gactgaggat ttccgaggac 4800 ccagacgttc accccgaggc tgagcagcag cctggcccgg agtcgggtga gggtcagaaa 4860 ggaggagagc agcccaaact ggtccggggg cacttctgcc ccattaaacg aaaagccaac 4920 agcaccaaga gggacagagg aactttgctc aaggcgcaga tccgtcacca gtcccttgac 4980 agtcagtctg aaaatgccac catcgaccta aattctgttc tagagcgaga attcagtgtc 5040 cagagtttaa catctgttgt cagtgaggag tgtttttatg aaacagagag 
ccacggaaaa 5100 tcatag 5106 64 85 PRT Artificial Sequence consensus sequence 64 Val Ile Lys Glu Gly Trp Leu Leu Lys Lys Ser Lys Ser Trp Lys Lys 1 5 10 15 Arg Tyr Phe Val Leu Phe Asn Asn Val Leu Leu Tyr Tyr Lys Asp Ser 20 25 30 Lys Lys Lys Pro Lys Gly Ser Ile Pro Leu Ser Gly Cys Gln Val Glu 35 40 45 Lys Pro Asp Lys Asn Cys Phe Glu Ile Arg Thr Asp Arg Thr Leu Leu 50 55 60 Leu Gln Ala Glu Ser Glu Glu Glu Arg Lys Glu Trp Val Lys Ala Ile 65 70 75 80 Gln Ser Ala Ile Arg 85 65 77 PRT Artificial Sequence consensus sequence 65 Lys Thr Ile Arg Val His Leu Pro Asn Asn Gln Arg Ser Val Val Glu 1 5 10 15 Val Arg Pro Gly Met Thr Val Arg Asp Ala Leu Ala Lys Ala Leu Lys 20 25 30 Lys Arg Gly Leu Asn Pro Ser Ala Cys Val Val Arg Arg Ser Gly Asp 35 40 45 Pro Gln Glu Gly Glu Lys Lys Pro Leu Asp Leu Asp Thr Asp Ile Ser 50 55 60 Ser Leu Pro Gly Pro Glu Glu Leu Val Val Glu Asn Leu 65 70 75 66 83 PRT Artificial Sequence consensus sequence 66 Glu Ile Thr Leu Glu Lys Glu Val Lys Arg Gly Gly Leu Gly Phe Ser 1 5 10 15 Ile Lys Gly Gly Ser Asp Lys Gly Ile Val Val Ser Glu Val Leu Pro 20 25 30 Gly Ser Gly Ala Ala Glu Ala Gly Gly Arg Leu Lys Glu Gly Asp Val 35 40 45 Ile Leu Ser Val Asn Gly Gln Asp Val Glu Asn Met Ser His Glu Arg 50 55 60 Ala Val Leu Ala Ile Lys Gly Ser Gly Gly Glu Val Thr Leu Thr Val 65 70 75 80 Leu Arg Asp 67 207 PRT Artificial Sequence consensus sequence 67 Val Leu Lys Glu Leu Leu Glu Thr Glu Lys Lys Tyr Val Arg Asp Leu 1 5 10 15 Glu Ile Leu Asp Asn Val Tyr Met Lys Pro Leu Arg Glu Ala Ala Ile 20 25 30 Ser Ser Lys Pro Val Leu Thr Pro Asp Asp Ile Glu Thr Ile Phe Ser 35 40 45 Asn Ile Glu Asp Ile Tyr Glu Phe His Arg Glu Phe Leu Lys Ser Ser 50 55 60 Leu Glu Ala Arg Ile Ser Ser Ser Gln Phe Glu Asp Leu Asp Glu Lys 65 70 75 80 Lys Ile Glu Pro Ser Ala Pro Arg Leu Gly Asp Leu Phe Leu Lys Leu 85 90 95 Lys Glu Pro Phe Leu Gln Val Tyr Gly Glu Tyr Cys Ser Asn Lys Pro 100 105 110 Tyr Ala Gln Glu Leu Leu Glu Lys Leu Arg Gln Ala Ala Ser Asn Pro 115 120 125 Gln Phe Ala Glu Phe Leu Asp Glu Val Glu 
Ala Ser Ser Asn Thr Gly 130 135 140 Ala Lys Asp Asp Ala Val Lys Leu Thr Leu Gln Ser Leu Leu Leu Lys 145 150 155 160 Pro Val Gln Arg Ile Leu Arg Tyr Pro Leu Leu Leu Lys Glu Leu Leu 165 170 175 Lys His Thr Pro Glu Gly Glu Asp Gln Pro Asp Arg Glu Asp Leu Lys 180 185 190 Lys Ala Leu Asp Leu Leu Gln Asp Leu Ala Lys Ser Ile Asn Glu 195 200 205 68 67 PRT Artificial Sequence consensus sequence 68 Phe Val Leu Phe Asn Asn Val Leu Leu Tyr Tyr Lys Asp Ser Lys Lys 1 5 10 15 Lys Pro Lys Gly Ser Ile Pro Leu Ser Gly Cys Gln Val Glu Lys Pro 20 25 30 Asp Lys Asn Cys Phe Glu Ile Arg Thr Asp Arg Thr Leu Leu Leu Gln 35 40 45 Ala Glu Ser Glu Glu Glu Arg Lys Glu Trp Val Lys Ala Ile Gln Ser 50 55 60 Ala Ile Arg 65 69 82 PRT Artificial Sequence consensus sequence 69 Val Ile Lys Glu Gly Trp Leu Leu Lys Lys Ser Lys Ser Trp Lys Lys 1 5 10 15 Arg Tyr Phe Val Leu Phe Asn Gly Val Leu Leu Tyr Tyr Lys Ser Lys 20 25 30 Lys Pro Lys Gly Ser Ile Pro Leu Ser Gly Cys Ser Val Arg Glu Pro 35 40 45 Cys Phe Glu Ile Val Thr Asp Arg Thr Leu Leu Leu Gln Ala Glu Ser 50 55 60 Glu Glu Glu Arg Glu Glu Trp Val Glu Ala Leu Gln Ser Ala Ile Ala 65 70 75 80 Lys Ala 70 76 PRT Artificial Sequence consensus sequence 70 Lys Thr Cys Arg Val His Leu Pro Asp Asn Gln Arg Thr Val Val Lys 1 5 10 15 Val Arg Pro Gly Lys Thr Val Arg Asp Ala Leu Ala Lys Ala Leu Lys 20 25 30 Lys Arg Gly Leu Asn Pro Glu Ala Cys Val Val Arg Leu Arg Gly Asp 35 40 45 Pro Gln Glu Gly Glu Lys Lys Pro Leu Asp Leu Asn Gln Asp Ile Ser 50 55 60 Ser Leu Ala Gly Gln Glu Leu Val Val Glu Glu Leu 65 70 75 71 80 PRT Artificial Sequence consensus sequence 71 Gly Gly Leu Gly Phe Ser Ile Val Gly Gly Ile Phe Val Ser Ser Val 1 5 10 15 Val Pro Gly Ser Pro Ala Ala Lys Ala Gly Arg Lys Ser Leu Gly Leu 20 25 30 Leu Lys Val Gly Asp Val Ile Leu Glu Val Asn Gly Glu Thr Ser Val 35 40 45 Glu Gly Leu Thr His Glu Glu Ala Val Asp Leu Leu Lys Lys Ala Gly 50 55 60 Gly Gly Gly Val Gly Glu Lys Val Thr Leu Thr Val Leu Arg Gly Gly 65 70 75 80 72 211 PRT Artificial Sequence consensus 
sequence 72 Val Leu Lys Glu Leu Leu Gln Thr Glu Arg Asn Tyr Val Arg Asp Leu 1 5 10 15 Lys Ile Leu Val Glu Val Phe Leu Lys Pro Leu Lys Lys Glu Ala Lys 20 25 30 Ser Ser Leu Leu Pro Leu Leu Ser Pro Asp Glu Val Lys Thr Leu Phe 35 40 45 Gly Pro Asn Ile Glu Glu Ile Tyr Glu Phe His Arg Arg Phe Leu Asp 50 55 60 Glu Leu Glu Glu Arg Val Glu Glu Trp Leu Leu Ser Lys Asp Leu Lys 65 70 75 80 Ser Glu Arg Asn Ser Val Ile Glu Asp Ser Gly Glu Arg Ile Gly Asp 85 90 95 Val Phe Leu Lys Leu Phe Ser Ala Glu Glu Phe Phe Lys Ile Tyr Ser 100 105 110 Glu Tyr Cys Ser Asn His Pro Asp Ala Leu Glu Leu Leu Lys Lys Leu 115 120 125 Met Lys Lys Lys Lys Asn Pro Ala Phe Gln Lys Phe Leu Lys Glu Ile 130 135 140 Glu Ser Lys Pro Asn Cys Arg Ser Lys Ser Glu Ala Arg Leu Thr Leu 145 150 155 160 Glu Ser Leu Leu Ile Lys Pro Val Gln Arg Leu Thr Lys Tyr Pro Leu 165 170 175 Leu Leu Lys Glu Leu Leu Lys His Thr Pro Pro Asp His Glu Asp Arg 180 185 190 Glu Asp Leu Lys Lys Ala Leu Glu Ala Ile Lys Glu Leu Ala Ser Gln 195 200 205 Val Asn Glu 210 73 82 PRT Artificial Sequence consensus sequence 73 Val Ile Lys Glu Gly Trp Leu Leu Lys Lys Ser Lys Ser Trp Lys Lys 1 5 10 15 Arg Tyr Phe Val Leu Phe Asn Gly Val Leu Leu Tyr Tyr Lys Ser Lys 20 25 30 Lys Pro Lys Gly Ser Ile Pro Leu Ser Gly Cys Ser Val Arg Glu Pro 35 40 45 Cys Phe Glu Ile Val Thr Asp Arg Thr Leu Leu Leu Gln Ala Glu Ser 50 55 60 Glu Glu Glu Arg Glu Glu Trp Val Glu Ala Leu Gln Ser Ala Ile Ala 65 70 75 80 Lys Ala 74 3494 DNA Homo sapiens CDS (405)...(3206) 74 tgctgtcgct agattcagat gattcaagtg aggatcaagt ggaaaatagt aaaaattcct 60 ggagttgcaa gtttgttgct gctggagggc ttcaacagtt attagaaatt tttaattctg 120 gaattctaga gcctaaagag caggaatcat ggactgtgtg gcagctagac tgtcttgctt 180 gcttgctgaa gttaatatgc cagtttgcag tagatccatc cgatttggat ttagcttatc 240 atgatgtctt tgcctggtct ggtatagcgg aaagccatag gaaaagaacc tggcctggca 300 aatcaaggaa ggctgctggt gatcatgcta agggtcttca tataccacga ttaacagagg 360 tatttcttgt tcttgtccaa ggaaccagtt tgattcagcg actt atg tct gtt gct 416 Met Ser Val Ala 1 tat acg tat 
gat aat ctg gct cct aga gtt tta aaa gct cag tct gat 464 Tyr Thr Tyr Asp Asn Leu Ala Pro Arg Val Leu Lys Ala Gln Ser Asp 5 10 15 20 cac agg tct aga cat gaa gtt tca cat tat tca atg tgg ctc ttg gtg 512 His Arg Ser Arg His Glu Val Ser His Tyr Ser Met Trp Leu Leu Val 25 30 35 agt tgg gct cat tgc tgt tct tta gtg aaa tct agc ctt gct gat agc 560 Ser Trp Ala His Cys Cys Ser Leu Val Lys Ser Ser Leu Ala Asp Ser 40 45 50 gat cat tta caa gat tgg cta aag aaa ttg act ctc ctt att cct gag 608 Asp His Leu Gln Asp Trp Leu Lys Lys Leu Thr Leu Leu Ile Pro Glu 55 60 65 act gca gtt cgt cat gaa tca tgc agt ggt ctc tat aag tta tcc ctg 656 Thr Ala Val Arg His Glu Ser Cys Ser Gly Leu Tyr Lys Leu Ser Leu 70 75 80 tca ggg ctg gat gga gga gac tca atc aat cgt tct ttt ctg cta ttg 704 Ser Gly Leu Asp Gly Gly Asp Ser Ile Asn Arg Ser Phe Leu Leu Leu 85 90 95 100 gct gcc tca aca tta ttg aaa ttt ctt cct gat gct caa gca ctc aaa 752 Ala Ala Ser Thr Leu Leu Lys Phe Leu Pro Asp Ala Gln Ala Leu Lys 105 110 115 cct att agg ata gat gat tat gag gaa gaa cca ata tta aaa cca gga 800 Pro Ile Arg Ile Asp Asp Tyr Glu Glu Glu Pro Ile Leu Lys Pro Gly 120 125 130 tgt aaa gag tat ttt tgg ttg tta tgc aaa tta gtt gac aac ata cat 848 Cys Lys Glu Tyr Phe Trp Leu Leu Cys Lys Leu Val Asp Asn Ile His 135 140 145 ata aag gac gct agt cag aca acg ctc ctc gac tta gat gcc ttg gca 896 Ile Lys Asp Ala Ser Gln Thr Thr Leu Leu Asp Leu Asp Ala Leu Ala 150 155 160 aga cat ttg gct gac tgt att cga agt agg gag atc ctt gat cat cag 944 Arg His Leu Ala Asp Cys Ile Arg Ser Arg Glu Ile Leu Asp His Gln 165 170 175 180 gat ggt aat gta gaa gat gat ggg ctt aca gga ctc cta agg ctt gca 992 Asp Gly Asn Val Glu Asp Asp Gly Leu Thr Gly Leu Leu Arg Leu Ala 185 190 195 aca agt gtt gtt aaa cac aaa cca ccc ttt aaa ttt tca agg gaa gga 1040 Thr Ser Val Val Lys His Lys Pro Pro Phe Lys Phe Ser Arg Glu Gly 200 205 210 cag gaa ttt ttg aga gat atc ttc aat ctc ctg ttt ttg ttg cca agt 1088 Gln Glu Phe Leu Arg Asp Ile Phe Asn Leu Leu Phe Leu Leu Pro Ser 215 220 225 
cta aag gac cga caa cag cca aag tgc aaa tca cat tct aca aga gct 1136 Leu Lys Asp Arg Gln Gln Pro Lys Cys Lys Ser His Ser Thr Arg Ala 230 235 240 gcc gct tac gat ttg tta gta gag atg gta aag ggg tct gtt gag aac 1184 Ala Ala Tyr Asp Leu Leu Val Glu Met Val Lys Gly Ser Val Glu Asn 245 250 255 260 tac agg cta ata cac aac tgg gtt atg gca caa cac atg cag tcc cat 1232 Tyr Arg Leu Ile His Asn Trp Val Met Ala Gln His Met Gln Ser His 265 270 275 gca cct tat aaa tgg gat tac tgg cct cat gaa gat gtc cgt gct gaa 1280 Ala Pro Tyr Lys Trp Asp Tyr Trp Pro His Glu Asp Val Arg Ala Glu 280 285 290 tgt aga ttt gtt ggc ctt act aac ctt gga gct act tgt tac tta gct 1328 Cys Arg Phe Val Gly Leu Thr Asn Leu Gly Ala Thr Cys Tyr Leu Ala 295 300 305 tct act att cag caa ctt tat atg ata cct gag gca aga cag gct gtc 1376 Ser Thr Ile Gln Gln Leu Tyr Met Ile Pro Glu Ala Arg Gln Ala Val 310 315 320 ttc act gcc aag tat tca gag gat atg aag cac aag acc act ctt ctg 1424 Phe Thr Ala Lys Tyr Ser Glu Asp Met Lys His Lys Thr Thr Leu Leu 325 330 335 340 gag ctt cag aaa atg ttt aca tat tta atg gag agt gaa tgc aaa gca 1472 Glu Leu Gln Lys Met Phe Thr Tyr Leu Met Glu Ser Glu Cys Lys Ala 345 350 355 tat aat cct aga cct ttc tgt aaa aca tac acc atg gat aag cag cct 1520 Tyr Asn Pro Arg Pro Phe Cys Lys Thr Tyr Thr Met Asp Lys Gln Pro 360 365 370 ctg aat act ggg gaa cag aaa gat atg aca gag ttt ttt act gat cta 1568 Leu Asn Thr Gly Glu Gln Lys Asp Met Thr Glu Phe Phe Thr Asp Leu 375 380 385 att acc aaa atc gaa gaa atg tct ccc gaa ctg aaa aat acc gtc aaa 1616 Ile Thr Lys Ile Glu Glu Met Ser Pro Glu Leu Lys Asn Thr Val Lys 390 395 400 agt tta ttt gga ggt gta att aca aac aat gtt gta tcc ttg gat tgt 1664 Ser Leu Phe Gly Gly Val Ile Thr Asn Asn Val Val Ser Leu Asp Cys 405 410 415 420 gaa cat gtt agt caa act gct gaa gag ttt tat act gtg agg tgc caa 1712 Glu His Val Ser Gln Thr Ala Glu Glu Phe Tyr Thr Val Arg Cys Gln 425 430 435 gtg gct gat atg aag aac att tat gaa tct ctt gat gaa gtt act ata 1760 Val Ala Asp Met Lys Asn Ile Tyr 
Glu Ser Leu Asp Glu Val Thr Ile 440 445 450 aaa gac act ttg gaa ggt gat aac atg tat act tgt tct cat tgt ggg 1808 Lys Asp Thr Leu Glu Gly Asp Asn Met Tyr Thr Cys Ser His Cys Gly 455 460 465 aag aaa gta cga gct gaa aaa agg gca tgt ttt aag aaa ttg cct cgc 1856 Lys Lys Val Arg Ala Glu Lys Arg Ala Cys Phe Lys Lys Leu Pro Arg 470 475 480 att ttg agt ttc aat act atg aga tac aca ttt aat atg gtc acg atg 1904 Ile Leu Ser Phe Asn Thr Met Arg Tyr Thr Phe Asn Met Val Thr Met 485 490 495 500 atg aaa gag aaa gtg aat aca cac ttt tcc ttc cca tta cgt ttg gac 1952 Met Lys Glu Lys Val Asn Thr His Phe Ser Phe Pro Leu Arg Leu Asp 505 510 515 atg acg ccc tat aca gaa gat ttt ctt atg gga aag agt gag agg aaa 2000 Met Thr Pro Tyr Thr Glu Asp Phe Leu Met Gly Lys Ser Glu Arg Lys 520 525 530 gaa ggt ttt aaa gaa gtc agt gat cat tca aaa gac tca gag agc tat 2048 Glu Gly Phe Lys Glu Val Ser Asp His Ser Lys Asp Ser Glu Ser Tyr 535 540 545 gaa tat gac ttg ata gga gtg act gtt cac aca gga acg gca gat ggt 2096 Glu Tyr Asp Leu Ile Gly Val Thr Val His Thr Gly Thr Ala Asp Gly 550 555 560 gga cac tat tat agc ttt atc aga gat ata gta aat ccc cat gct tat 2144 Gly His Tyr Tyr Ser Phe Ile Arg Asp Ile Val Asn Pro His Ala Tyr 565 570 575 580 aaa aac aat aaa tgg tat ctt ttt aat gat gct gag gta aaa cct ttt 2192 Lys Asn Asn Lys Trp Tyr Leu Phe Asn Asp Ala Glu Val Lys Pro Phe 585 590 595 gat tct gct caa ctt gca tct gaa tgt ttt ggt gga gag atg acg acc 2240 Asp Ser Ala Gln Leu Ala Ser Glu Cys Phe Gly Gly Glu Met Thr Thr 600 605 610 aag acc tat gat tct gtt aca gat aaa ttt atg gac ttc tct ttt gaa 2288 Lys Thr Tyr Asp Ser Val Thr Asp Lys Phe Met Asp Phe Ser Phe Glu 615 620 625 aag aca cac agt gca tat atg ctg ttt tac aaa cgc atg gaa cca gag 2336 Lys Thr His Ser Ala Tyr Met Leu Phe Tyr Lys Arg Met Glu Pro Glu 630 635 640 gaa gaa aat ggc aga gaa tac aaa ttt gat gtt tcg tca gag tta cta 2384 Glu Glu Asn Gly Arg Glu Tyr Lys Phe Asp Val Ser Ser Glu Leu Leu 645 650 655 660 gag tgg att tgg cat gat aac atg cag ttt ctt caa gac aaa 
aac att 2432 Glu Trp Ile Trp His Asp Asn Met Gln Phe Leu Gln Asp Lys Asn Ile 665 670 675 ttt gaa cat aca tat ttt gga ttt atg tgg caa ttg tgt agt tgt att 2480 Phe Glu His Thr Tyr Phe Gly Phe Met Trp Gln Leu Cys Ser Cys Ile 680 685 690 ccc agt aca tta cca gat cct aaa gct gtg tcc tta atg aca gca aag 2528 Pro Ser Thr Leu Pro Asp Pro Lys Ala Val Ser Leu Met Thr Ala Lys 695 700 705 tta agc act tcc ttt gtc cta gag aca ttt att cat tct aaa gaa aag 2576 Leu Ser Thr Ser Phe Val Leu Glu Thr Phe Ile His Ser Lys Glu Lys 710 715 720 ccc acg atg ctt cag tgg att gaa ctg ttg acg aaa cag ttt aat aat 2624 Pro Thr Met Leu Gln Trp Ile Glu Leu Leu Thr Lys Gln Phe Asn Asn 725 730 735 740 agt cag gca gct tgt gag tgg ttt tta gat cgt atg gct gat gac gac 2672 Ser Gln Ala Ala Cys Glu Trp Phe Leu Asp Arg Met Ala Asp Asp Asp 745 750 755 tgg tgg cca atg cag ata cta att aag tgc cct aat caa att gtg aga 2720 Trp Trp Pro Met Gln Ile Leu Ile Lys Cys Pro Asn Gln Ile Val Arg 760 765 770 cag atg ttt cag cgt ttg tgt atc cat gtg att cag agg ctg aga cct 2768 Gln Met Phe Gln Arg Leu Cys Ile His Val Ile Gln Arg Leu Arg Pro 775 780 785 gtg cat gct cat ctc tat ttg cag cca gga atg gaa gat ggg tca gat 2816 Val His Ala His Leu Tyr Leu Gln Pro Gly Met Glu Asp Gly Ser Asp 790 795 800 gat atg gat acc tca gta gaa gat att ggt ggt cgt tca tgt gtc act 2864 Asp Met Asp Thr Ser Val Glu Asp Ile Gly Gly Arg Ser Cys Val Thr 805 810 815 820 cgc ttt gtg aga acc ctg tta tta att atg gaa cat ggt gta aaa cct 2912 Arg Phe Val Arg Thr Leu Leu Leu Ile Met Glu His Gly Val Lys Pro 825 830 835 cac agt aaa cat ctt aca gag tat ttt gcc ttc ctt tac gaa ttt gca 2960 His Ser Lys His Leu Thr Glu Tyr Phe Ala Phe Leu Tyr Glu Phe Ala 840 845 850 aaa atg ggt gaa gaa gag agc caa ttt ttg ctt tca ttg caa gct ata 3008 Lys Met Gly Glu Glu Glu Ser Gln Phe Leu Leu Ser Leu Gln Ala Ile 855 860 865 tct aca atg gta cat ttt tac atg gga aca aaa gga cct gaa aat cct 3056 Ser Thr Met Val His Phe Tyr Met Gly Thr Lys Gly Pro Glu Asn Pro 870 875 880 caa gtt gaa gtg 
tta tca gag gaa gaa ggg gaa gaa gaa gag gag gaa 3104 Gln Val Glu Val Leu Ser Glu Glu Glu Gly Glu Glu Glu Glu Glu Glu 885 890 895 900 gaa gat atc ctc tct ctg gca gaa gaa aaa tac agg cca gct gcc ctt 3152 Glu Asp Ile Leu Ser Leu Ala Glu Glu Lys Tyr Arg Pro Ala Ala Leu 905 910 915 gaa aag atg ata gct tta gtt gct ctt ttg gtt gaa cag tct cga tca 3200 Glu Lys Met Ile Ala Leu Val Ala Leu Leu Val Glu Gln Ser Arg Ser 920 925 930 gaa agg tgaaatgttt cgaatttaaa atgtttaaag catgtttggt tttattattt 3256 Glu Arg ttacataatt gtttaccact agtttttcca ctagcttttt attatatatg tttaattatg 3316 taattgttat tcactagctt ttattatata aatcctttta aataatacta ctattcatca 3376 actcttgtgg cataagaatt tcagtttttt ctaccaaact tttacttcat ctatgagtcg 3436 tgttagaaat agtcattgaa aaaatataca gtaaaatatc taaaaaaaaa aaaaaagg 3494 75 934 PRT Homo sapiens 75 Met Ser Val Ala Tyr Thr Tyr Asp Asn Leu Ala Pro Arg Val Leu Lys 1 5 10 15 Ala Gln Ser Asp His Arg Ser Arg His Glu Val Ser His Tyr Ser Met 20 25 30 Trp Leu Leu Val Ser Trp Ala His Cys Cys Ser Leu Val Lys Ser Ser 35 40 45 Leu Ala Asp Ser Asp His Leu Gln Asp Trp Leu Lys Lys Leu Thr Leu 50 55 60 Leu Ile Pro Glu Thr Ala Val Arg His Glu Ser Cys Ser Gly Leu Tyr 65 70 75 80 Lys Leu Ser Leu Ser Gly Leu Asp Gly Gly Asp Ser Ile Asn Arg Ser 85 90 95 Phe Leu Leu Leu Ala Ala Ser Thr Leu Leu Lys Phe Leu Pro Asp Ala 100 105 110 Gln Ala Leu Lys Pro Ile Arg Ile Asp Asp Tyr Glu Glu Glu Pro Ile 115 120 125 Leu Lys Pro Gly Cys Lys Glu Tyr Phe Trp Leu Leu Cys Lys Leu Val 130 135 140 Asp Asn Ile His Ile Lys Asp Ala Ser Gln Thr Thr Leu Leu Asp Leu 145 150 155 160 Asp Ala Leu Ala Arg His Leu Ala Asp Cys Ile Arg Ser Arg Glu Ile 165 170 175 Leu Asp His Gln Asp Gly Asn Val Glu Asp Asp Gly Leu Thr Gly Leu 180 185 190 Leu Arg Leu Ala Thr Ser Val Val Lys His Lys Pro Pro Phe Lys Phe 195 200 205 Ser Arg Glu Gly Gln Glu Phe Leu Arg Asp Ile Phe Asn Leu Leu Phe 210 215 220 Leu Leu Pro Ser Leu Lys Asp Arg Gln Gln Pro Lys Cys Lys Ser His 225 230 235 240 Ser Thr Arg Ala Ala Ala Tyr Asp Leu Leu Val Glu Met Val 
Lys Gly 245 250 255 Ser Val Glu Asn Tyr Arg Leu Ile His Asn Trp Val Met Ala Gln His 260 265 270 Met Gln Ser His Ala Pro Tyr Lys Trp Asp Tyr Trp Pro His Glu Asp 275 280 285 Val Arg Ala Glu Cys Arg Phe Val Gly Leu Thr Asn Leu Gly Ala Thr 290 295 300 Cys Tyr Leu Ala Ser Thr Ile Gln Gln Leu Tyr Met Ile Pro Glu Ala 305 310 315 320 Arg Gln Ala Val Phe Thr Ala Lys Tyr Ser Glu Asp Met Lys His Lys 325 330 335 Thr Thr Leu Leu Glu Leu Gln Lys Met Phe Thr Tyr Leu Met Glu Ser 340 345 350 Glu Cys Lys Ala Tyr Asn Pro Arg Pro Phe Cys Lys Thr Tyr Thr Met 355 360 365 Asp Lys Gln Pro Leu Asn Thr Gly Glu Gln Lys Asp Met Thr Glu Phe 370 375 380 Phe Thr Asp Leu Ile Thr Lys Ile Glu Glu Met Ser Pro Glu Leu Lys 385 390 395 400 Asn Thr Val Lys Ser Leu Phe Gly Gly Val Ile Thr Asn Asn Val Val 405 410 415 Ser Leu Asp Cys Glu His Val Ser Gln Thr Ala Glu Glu Phe Tyr Thr 420 425 430 Val Arg Cys Gln Val Ala Asp Met Lys Asn Ile Tyr Glu Ser Leu Asp 435 440 445 Glu Val Thr Ile Lys Asp Thr Leu Glu Gly Asp Asn Met Tyr Thr Cys 450 455 460 Ser His Cys Gly Lys Lys Val Arg Ala Glu Lys Arg Ala Cys Phe Lys 465 470 475 480 Lys Leu Pro Arg Ile Leu Ser Phe Asn Thr Met Arg Tyr Thr Phe Asn 485 490 495 Met Val Thr Met Met Lys Glu Lys Val Asn Thr His Phe Ser Phe Pro 500 505 510 Leu Arg Leu Asp Met Thr Pro Tyr Thr Glu Asp Phe Leu Met Gly Lys 515 520 525 Ser Glu Arg Lys Glu Gly Phe Lys Glu Val Ser Asp His Ser Lys Asp 530 535 540 Ser Glu Ser Tyr Glu Tyr Asp Leu Ile Gly Val Thr Val His Thr Gly 545 550 555 560 Thr Ala Asp Gly Gly His Tyr Tyr Ser Phe Ile Arg Asp Ile Val Asn 565 570 575 Pro His Ala Tyr Lys Asn Asn Lys Trp Tyr Leu Phe Asn Asp Ala Glu 580 585 590 Val Lys Pro Phe Asp Ser Ala Gln Leu Ala Ser Glu Cys Phe Gly Gly 595 600 605 Glu Met Thr Thr Lys Thr Tyr Asp Ser Val Thr Asp Lys Phe Met Asp 610 615 620 Phe Ser Phe Glu Lys Thr His Ser Ala Tyr Met Leu Phe Tyr Lys Arg 625 630 635 640 Met Glu Pro Glu Glu Glu Asn Gly Arg Glu Tyr Lys Phe Asp Val Ser 645 650 655 Ser Glu Leu Leu Glu Trp Ile Trp His Asp Asn Met Gln Phe Leu 
Gln 660 665 670 Asp Lys Asn Ile Phe Glu His Thr Tyr Phe Gly Phe Met Trp Gln Leu 675 680 685 Cys Ser Cys Ile Pro Ser Thr Leu Pro Asp Pro Lys Ala Val Ser Leu 690 695 700 Met Thr Ala Lys Leu Ser Thr Ser Phe Val Leu Glu Thr Phe Ile His 705 710 715 720 Ser Lys Glu Lys Pro Thr Met Leu Gln Trp Ile Glu Leu Leu Thr Lys 725 730 735 Gln Phe Asn Asn Ser Gln Ala Ala Cys Glu Trp Phe Leu Asp Arg Met 740 745 750 Ala Asp Asp Asp Trp Trp Pro Met Gln Ile Leu Ile Lys Cys Pro Asn 755 760 765 Gln Ile Val Arg Gln Met Phe Gln Arg Leu Cys Ile His Val Ile Gln 770 775 780 Arg Leu Arg Pro Val His Ala His Leu Tyr Leu Gln Pro Gly Met Glu 785 790 795 800 Asp Gly Ser Asp Asp Met Asp Thr Ser Val Glu Asp Ile Gly Gly Arg 805 810 815 Ser Cys Val Thr Arg Phe Val Arg Thr Leu Leu Leu Ile Met Glu His 820 825 830 Gly Val Lys Pro His Ser Lys His Leu Thr Glu Tyr Phe Ala Phe Leu 835 840 845 Tyr Glu Phe Ala Lys Met Gly Glu Glu Glu Ser Gln Phe Leu Leu Ser 850 855 860 Leu Gln Ala Ile Ser Thr Met Val His Phe Tyr Met Gly Thr Lys Gly 865 870 875 880 Pro Glu Asn Pro Gln Val Glu Val Leu Ser Glu Glu Glu Gly Glu Glu 885 890 895 Glu Glu Glu Glu Glu Asp Ile Leu Ser Leu Ala Glu Glu Lys Tyr Arg 900 905 910 Pro Ala Ala Leu Glu Lys Met Ile Ala Leu Val Ala Leu Leu Val Glu 915 920 925 Gln Ser Arg Ser Glu Arg 930 76 2805 DNA Homo sapiens 76 atgtctgttg cttatacgta tgataatctg gctcctagag ttttaaaagc tcagtctgat 60 cacaggtcta gacatgaagt ttcacattat tcaatgtggc tcttggtgag ttgggctcat 120 tgctgttctt tagtgaaatc tagccttgct gatagcgatc atttacaaga ttggctaaag 180 aaattgactc tccttattcc tgagactgca gttcgtcatg aatcatgcag tggtctctat 240 aagttatccc tgtcagggct ggatggagga gactcaatca atcgttcttt tctgctattg 300 gctgcctcaa cattattgaa atttcttcct gatgctcaag cactcaaacc tattaggata 360 gatgattatg aggaagaacc aatattaaaa ccaggatgta aagagtattt ttggttgtta 420 tgcaaattag ttgacaacat acatataaag gacgctagtc agacaacgct cctcgactta 480 gatgccttgg caagacattt ggctgactgt attcgaagta gggagatcct tgatcatcag 540 gatggtaatg tagaagatga tgggcttaca ggactcctaa ggcttgcaac aagtgttgtt 600 
aaacacaaac caccctttaa attttcaagg gaaggacagg aatttttgag agatatcttc 660 aatctcctgt ttttgttgcc aagtctaaag gaccgacaac agccaaagtg caaatcacat 720 tctacaagag ctgccgctta cgatttgtta gtagagatgg taaaggggtc tgttgagaac 780 tacaggctaa tacacaactg ggttatggca caacacatgc agtcccatgc accttataaa 840 tgggattact ggcctcatga agatgtccgt gctgaatgta gatttgttgg ccttactaac 900 cttggagcta cttgttactt agcttctact attcagcaac tttatatgat acctgaggca 960 agacaggctg tcttcactgc caagtattca gaggatatga agcacaagac cactcttctg 1020 gagcttcaga aaatgtttac atatttaatg gagagtgaat gcaaagcata taatcctaga 1080 cctttctgta aaacatacac catggataag cagcctctga atactgggga acagaaagat 1140 atgacagagt tttttactga tctaattacc aaaatcgaag aaatgtctcc cgaactgaaa 1200 aataccgtca aaagtttatt tggaggtgta attacaaaca atgttgtatc cttggattgt 1260 gaacatgtta gtcaaactgc tgaagagttt tatactgtga ggtgccaagt ggctgatatg 1320 aagaacattt atgaatctct tgatgaagtt actataaaag acactttgga aggtgataac 1380 atgtatactt gttctcattg tgggaagaaa gtacgagctg aaaaaagggc atgttttaag 1440 aaattgcctc gcattttgag tttcaatact atgagataca catttaatat ggtcacgatg 1500 atgaaagaga aagtgaatac acacttttcc ttcccattac gtttggacat gacgccctat 1560 acagaagatt ttcttatggg aaagagtgag aggaaagaag gttttaaaga agtcagtgat 1620 cattcaaaag actcagagag ctatgaatat gacttgatag gagtgactgt tcacacagga 1680 acggcagatg gtggacacta ttatagcttt atcagagata tagtaaatcc ccatgcttat 1740 aaaaacaata aatggtatct ttttaatgat gctgaggtaa aaccttttga ttctgctcaa 1800 cttgcatctg aatgttttgg tggagagatg acgaccaaga cctatgattc tgttacagat 1860 aaatttatgg acttctcttt tgaaaagaca cacagtgcat atatgctgtt ttacaaacgc 1920 atggaaccag aggaagaaaa tggcagagaa tacaaatttg atgtttcgtc agagttacta 1980 gagtggattt ggcatgataa catgcagttt cttcaagaca aaaacatttt tgaacataca 2040 tattttggat ttatgtggca attgtgtagt tgtattccca gtacattacc agatcctaaa 2100 gctgtgtcct taatgacagc aaagttaagc acttcctttg tcctagagac atttattcat 2160 tctaaagaaa agcccacgat gcttcagtgg attgaactgt tgacgaaaca gtttaataat 2220 agtcaggcag cttgtgagtg gtttttagat cgtatggctg atgacgactg gtggccaatg 2280 cagatactaa 
ttaagtgccc taatcaaatt gtgagacaga tgtttcagcg tttgtgtatc 2340 catgtgattc agaggctgag acctgtgcat gctcatctct atttgcagcc aggaatggaa 2400 gatgggtcag atgatatgga tacctcagta gaagatattg gtggtcgttc atgtgtcact 2460 cgctttgtga gaaccctgtt attaattatg gaacatggtg taaaacctca cagtaaacat 2520 cttacagagt attttgcctt cctttacgaa tttgcaaaaa tgggtgaaga agagagccaa 2580 tttttgcttt cattgcaagc tatatctaca atggtacatt tttacatggg aacaaaagga 2640 cctgaaaatc ctcaagttga agtgttatca gaggaagaag gggaagaaga agaggaggaa 2700 gaagatatcc tctctctggc agaagaaaaa tacaggccag ctgcccttga aaagatgata 2760 gctttagttg ctcttttggt tgaacagtct cgatcagaaa ggtga 2805 77 4873 DNA Homo sapiens CDS (85)...(3501) 77 ccacgcgtcc ggcctagtcc tgagaggctg ggccggcggc ggctgcggcg ggagaccggt 60 gacccgcggc tgggcgcctc ggcc atg act gcg gag ctg cag cag gac gac 111 Met Thr Ala Glu Leu Gln Gln Asp Asp 1 5 gcg gcc ggc gcg gca gac ggc cac ggc tcg agc tgc caa atg ctg tta 159 Ala Ala Gly Ala Ala Asp Gly His Gly Ser Ser Cys Gln Met Leu Leu 10 15 20 25 aat caa ctg aga gaa atc aca ggc att cag gac cct tcc ttt ctc cat 207 Asn Gln Leu Arg Glu Ile Thr Gly Ile Gln Asp Pro Ser Phe Leu His 30 35 40 gaa gct ctg aag gcc agt aat ggt gac att act cag gca gtc agc ctt 255 Glu Ala Leu Lys Ala Ser Asn Gly Asp Ile Thr Gln Ala Val Ser Leu 45 50 55 ctc act gat gag aga gtt aag gag ccc agt caa gac act gtt gct aca 303 Leu Thr Asp Glu Arg Val Lys Glu Pro Ser Gln Asp Thr Val Ala Thr 60 65 70 gaa cca tct gaa gta gag ggg agt gct gcc aac aag gaa gta tta gca 351 Glu Pro Ser Glu Val Glu Gly Ser Ala Ala Asn Lys Glu Val Leu Ala 75 80 85 aaa gtt ata gac ctt act cat gat aac aaa gat gat ctt cag gct gcc 399 Lys Val Ile Asp Leu Thr His Asp Asn Lys Asp Asp Leu Gln Ala Ala 90 95 100 105 att gct ttg agt cta ctg gag tct ccc aaa att caa gct gat gga aga 447 Ile Ala Leu Ser Leu Leu Glu Ser Pro Lys Ile Gln Ala Asp Gly Arg 110 115 120 gat ctt aac agg atg cat gaa gca acc tct gca gaa act aaa cgc tca 495 Asp Leu Asn Arg Met His Glu Ala Thr Ser Ala Glu Thr Lys Arg Ser 125 130 135 aag aga aaa cgc tgt gaa 
gtc tgg gga gaa aac ccc aat ccc aat gac 543 Lys Arg Lys Arg Cys Glu Val Trp Gly Glu Asn Pro Asn Pro Asn Asp 140 145 150 tgg agg aga gtt gat ggt tgg cca gtt ggg ctg aaa aat gtt ggc aat 591 Trp Arg Arg Val Asp Gly Trp Pro Val Gly Leu Lys Asn Val Gly Asn 155 160 165 aca tgt tgg ttt agt gct gtt att cag tct ctc ttt caa ttg cct gaa 639 Thr Cys Trp Phe Ser Ala Val Ile Gln Ser Leu Phe Gln Leu Pro Glu 170 175 180 185 ttt cga aga ctt gtt ctc agt tat agt ctg cca caa aat gta ctt gaa 687 Phe Arg Arg Leu Val Leu Ser Tyr Ser Leu Pro Gln Asn Val Leu Glu 190 195 200 aat tgt cga agt cat aca gaa aag aga aat atc atg ttt atg caa gag 735 Asn Cys Arg Ser His Thr Glu Lys Arg Asn Ile Met Phe Met Gln Glu 205 210 215 ctt cag tat ttg ttt gct cta atg atg gga tca aat aga aaa ttt gta 783 Leu Gln Tyr Leu Phe Ala Leu Met Met Gly Ser Asn Arg Lys Phe Val 220 225 230 gac ccg tct gca gcc ctg gat cta tta aag gga gca ttc cga tca tct 831 Asp Pro Ser Ala Ala Leu Asp Leu Leu Lys Gly Ala Phe Arg Ser Ser 235 240 245 gag gaa cag cag caa gat gtg agt gaa ttc aca cac aag ctc ctg gat 879 Glu Glu Gln Gln Gln Asp Val Ser Glu Phe Thr His Lys Leu Leu Asp 250 255 260 265 tgg cta gag gac gca ttc cag cta gct gtt aat gtt aac agt ccc agg 927 Trp Leu Glu Asp Ala Phe Gln Leu Ala Val Asn Val Asn Ser Pro Arg 270 275 280 aac aaa tct gaa aat cca atg gtg cag ctg ttc tat ggt act ttc ctg 975 Asn Lys Ser Glu Asn Pro Met Val Gln Leu Phe Tyr Gly Thr Phe Leu 285 290 295 act gaa ggg gtt cgt gaa gga aaa ccc ttt tgt aac aat gag acc ttc 1023 Thr Glu Gly Val Arg Glu Gly Lys Pro Phe Cys Asn Asn Glu Thr Phe 300 305 310 ggc cag tat cct ctt cag gta aac ggt tat cgc aac tta gac gag tgt 1071 Gly Gln Tyr Pro Leu Gln Val Asn Gly Tyr Arg Asn Leu Asp Glu Cys 315 320 325 ttg gaa ggg gcc atg gtg gag ggt gat gtt gag ctt ctt ccc tcc gat 1119 Leu Glu Gly Ala Met Val Glu Gly Asp Val Glu Leu Leu Pro Ser Asp 330 335 340 345 cac tcg gtg aag tat gga caa gag cgt tgg ttt aca aag cta cct cca 1167 His Ser Val Lys Tyr Gly Gln Glu Arg Trp Phe Thr Lys Leu Pro Pro 350 
355 360 gtg ttg acc ttt gaa ctc tca aga ttt gag ttt aat cag tcc ctt ggg 1215 Val Leu Thr Phe Glu Leu Ser Arg Phe Glu Phe Asn Gln Ser Leu Gly 365 370 375 cag cca gag aaa att cac aat aag ctg gaa ttt cct cag att att tat 1263 Gln Pro Glu Lys Ile His Asn Lys Leu Glu Phe Pro Gln Ile Ile Tyr 380 385 390 atg gac agg tac atg tac agg agc aag gag ctt att cga aat aag aga 1311 Met Asp Arg Tyr Met Tyr Arg Ser Lys Glu Leu Ile Arg Asn Lys Arg 395 400 405 gag tgt att cga aag ttg aag gag gaa ata aaa att ctg cag caa aaa 1359 Glu Cys Ile Arg Lys Leu Lys Glu Glu Ile Lys Ile Leu Gln Gln Lys 410 415 420 425 ttg gaa agg tat gtg aaa tat ggc tca ggc cca gct cgg ttc ccg ctc 1407 Leu Glu Arg Tyr Val Lys Tyr Gly Ser Gly Pro Ala Arg Phe Pro Leu 430 435 440 ccg gac atg ctg aaa tat gtt att gaa ttt gct agt aca aaa cct gcc 1455 Pro Asp Met Leu Lys Tyr Val Ile Glu Phe Ala Ser Thr Lys Pro Ala 445 450 455 tca gaa agc tgt cca cct gaa agt gac aca cat atg aca tta cca ctt 1503 Ser Glu Ser Cys Pro Pro Glu Ser Asp Thr His Met Thr Leu Pro Leu 460 465 470 tct tca gtg cac tgc tcg gtt tct gac cag aca tcc aag gaa agt aca 1551 Ser Ser Val His Cys Ser Val Ser Asp Gln Thr Ser Lys Glu Ser Thr 475 480 485 agt aca gaa agc tct tct cag gat gtt gaa agt acc ttt tct tct cct 1599 Ser Thr Glu Ser Ser Ser Gln Asp Val Glu Ser Thr Phe Ser Ser Pro 490 495 500 505 gaa gat tct tta ccc aag tct aaa cca ctg aca tct tct cgg tct tcc 1647 Glu Asp Ser Leu Pro Lys Ser Lys Pro Leu Thr Ser Ser Arg Ser Ser 510 515 520 atg gaa atg cct tca cag cca gct cca cga aca gtc aca gat gag gag 1695 Met Glu Met Pro Ser Gln Pro Ala Pro Arg Thr Val Thr Asp Glu Glu 525 530 535 ata aat ttt gtt aag acc tgt ctt cag aga tgg agg agt gag att gaa 1743 Ile Asn Phe Val Lys Thr Cys Leu Gln Arg Trp Arg Ser Glu Ile Glu 540 545 550 caa gat ata caa gat tta aag act tgt att gca agt act act cag act 1791 Gln Asp Ile Gln Asp Leu Lys Thr Cys Ile Ala Ser Thr Thr Gln Thr 555 560 565 att gaa cag atg tac tgc gat cct ctc ctt cgt cag gtg cct tat cgc 1839 Ile Glu Gln Met Tyr Cys Asp 
Pro Leu Leu Arg Gln Val Pro Tyr Arg 570 575 580 585 ttg cat gca gtt ctt gtt cat gaa gga caa gca aat gct gga cac tat 1887 Leu His Ala Val Leu Val His Glu Gly Gln Ala Asn Ala Gly His Tyr 590 595 600 tgg gcc tat atc tat aat caa ccc cga cag agc tgg ctc aag tac aat 1935 Trp Ala Tyr Ile Tyr Asn Gln Pro Arg Gln Ser Trp Leu Lys Tyr Asn 605 610 615 gac atc tct gtt act gaa tct tcc tgg gaa gaa gtt gaa aga gat tcc 1983 Asp Ile Ser Val Thr Glu Ser Ser Trp Glu Glu Val Glu Arg Asp Ser 620 625 630 tat gga ggc ctg aga aat gtt agt gct tac tgt ctg atg tac att aat 2031 Tyr Gly Gly Leu Arg Asn Val Ser Ala Tyr Cys Leu Met Tyr Ile Asn 635 640 645 gac aaa cta ccc tac ttc aat gca gag gca gcc cca act gaa tca gat 2079 Asp Lys Leu Pro Tyr Phe Asn Ala Glu Ala Ala Pro Thr Glu Ser Asp 650 655 660 665 caa atg tca gaa gtg gaa gcc cta tct gtg gaa ctc aag cat tac att 2127 Gln Met Ser Glu Val Glu Ala Leu Ser Val Glu Leu Lys His Tyr Ile 670 675 680 cag gag gat aac tgg cgg ttt gag cag gaa gta gag gag tgg gaa gaa 2175 Gln Glu Asp Asn Trp Arg Phe Glu Gln Glu Val Glu Glu Trp Glu Glu 685 690 695 gag cag tct tgc aaa atc cct caa atg gag tcc tcc acc aac tcc tca 2223 Glu Gln Ser Cys Lys Ile Pro Gln Met Glu Ser Ser Thr Asn Ser Ser 700 705 710 tca cag gac tac tct aca tca caa gag cct tca gta gcc tct tct cat 2271 Ser Gln Asp Tyr Ser Thr Ser Gln Glu Pro Ser Val Ala Ser Ser His 715 720 725 ggg gtt cgc tgc ttg tcg tct gag cat gct gtg att gta aag gag caa 2319 Gly Val Arg Cys Leu Ser Ser Glu His Ala Val Ile Val Lys Glu Gln 730 735 740 745 act gcc cag gct att gca aac aca gcc cgt gcc tat gag aag agc ggt 2367 Thr Ala Gln Ala Ile Ala Asn Thr Ala Arg Ala Tyr Glu Lys Ser Gly 750 755 760 gta gaa gcg gca ctg agt gag gtt aaa gaa gct gaa ccc aag aag ccc 2415 Val Glu Ala Ala Leu Ser Glu Val Lys Glu Ala Glu Pro Lys Lys Pro 765 770 775 atg ccc cag gaa aca aac ctt gca gag cag tca gaa cag ccc cca aag 2463 Met Pro Gln Glu Thr Asn Leu Ala Glu Gln Ser Glu Gln Pro Pro Lys 780 785 790 gct aat gat gca gag tct act gcc cag cct aat tct gag 
gtc tct gaa 2511 Ala Asn Asp Ala Glu Ser Thr Ala Gln Pro Asn Ser Glu Val Ser Glu 795 800 805 gtc gag att ccc agt gtg gga agg att ctg gtt aga tct gat gca gat 2559 Val Glu Ile Pro Ser Val Gly Arg Ile Leu Val Arg Ser Asp Ala Asp 810 815 820 825 gga tat gat gag gag gtg atg ctg agc cct gcc atg caa ggg gtc atc 2607 Gly Tyr Asp Glu Glu Val Met Leu Ser Pro Ala Met Gln Gly Val Ile 830 835 840 ctg gcc ata gct aaa gcc cgt cag acc ttt gac cga gat ggg tct gaa 2655 Leu Ala Ile Ala Lys Ala Arg Gln Thr Phe Asp Arg Asp Gly Ser Glu 845 850 855 gca ggg ctg att aag gca ttc cat gaa gaa tac tcc agg ctc tat cag 2703 Ala Gly Leu Ile Lys Ala Phe His Glu Glu Tyr Ser Arg Leu Tyr Gln 860 865 870 ctt gcc aaa gag acc ccc acc tct cac agt gat cct cga ctt cag cat 2751 Leu Ala Lys Glu Thr Pro Thr Ser His Ser Asp Pro Arg Leu Gln His 875 880 885 gtc ctt gtc tac ttt ttc caa aat gaa gca ccc aaa agg gta gta gaa 2799 Val Leu Val Tyr Phe Phe Gln Asn Glu Ala Pro Lys Arg Val Val Glu 890 895 900 905 cga acc ctt ctg gaa cag ttt gca gat aaa aat ctt agc tat gat gaa 2847 Arg Thr Leu Leu Glu Gln Phe Ala Asp Lys Asn Leu Ser Tyr Asp Glu 910 915 920 aga tca atc agc att atg aag gtg gct caa gcg aaa ctg aag gaa att 2895 Arg Ser Ile Ser Ile Met Lys Val Ala Gln Ala Lys Leu Lys Glu Ile 925 930 935 ggt cca gat gac atg aat atg gaa gag tac aag aag tgg cat gaa gat 2943 Gly Pro Asp Asp Met Asn Met Glu Glu Tyr Lys Lys Trp His Glu Asp 940 945 950 tat agt ttg ttc cga aaa gtg tct gtg tat ctc cta aca ggc cta gaa 2991 Tyr Ser Leu Phe Arg Lys Val Ser Val Tyr Leu Leu Thr Gly Leu Glu 955 960 965 ctc tat caa aaa gga aag tac caa gag gca ctt tcc tac ctg gta tat 3039 Leu Tyr Gln Lys Gly Lys Tyr Gln Glu Ala Leu Ser Tyr Leu Val Tyr 970 975 980 985 gcc tac cag agc aat gct gcc ctg ctg atg aag ggg ccc cgc cgg ggg 3087 Ala Tyr Gln Ser Asn Ala Ala Leu Leu Met Lys Gly Pro Arg Arg Gly 990 995 1000 gtc aaa gaa tcc gtg att gct tta tac cga aga aaa tgc ctt ctg gag 3135 Val Lys Glu Ser Val Ile Ala Leu Tyr Arg Arg Lys Cys Leu Leu Glu 1005 1010 1015 ctg 
aat gcc aaa gca gct tct ctt ttt gaa aca aat gat gat cac tcc 3183 Leu Asn Ala Lys Ala Ala Ser Leu Phe Glu Thr Asn Asp Asp His Ser 1020 1025 1030 gta act gag ggc att aat gtg atg aat gaa ctg atc atc ccc tgc att 3231 Val Thr Glu Gly Ile Asn Val Met Asn Glu Leu Ile Ile Pro Cys Ile 1035 1040 1045 cac ctt atc att aat aat gac att tcc aag gat gat ctg gat gcc att 3279 His Leu Ile Ile Asn Asn Asp Ile Ser Lys Asp Asp Leu Asp Ala Ile 1050 1055 1060 1065 gag gtc atg aga aac cat tgg tgc tct tac ctt ggg caa gat att gca 3327 Glu Val Met Arg Asn His Trp Cys Ser Tyr Leu Gly Gln Asp Ile Ala 1070 1075 1080 gaa aat ctg cag ctg tgc cta ggg gag ttt cta ccc aga ctt cta gat 3375 Glu Asn Leu Gln Leu Cys Leu Gly Glu Phe Leu Pro Arg Leu Leu Asp 1085 1090 1095 cct tct gca gaa atc atc gtc ttg aaa gag cct cca act att cga ccc 3423 Pro Ser Ala Glu Ile Ile Val Leu Lys Glu Pro Pro Thr Ile Arg Pro 1100 1105 1110 aat tct ccc tat gac cta tgt agc cga ttt gca gct gtc atg gag tca 3471 Asn Ser Pro Tyr Asp Leu Cys Ser Arg Phe Ala Ala Val Met Glu Ser 1115 1120 1125 att cag gga gtt tca act gtg aca gtg aaa taagctccca catgttcaag 3521 Ile Gln Gly Val Ser Thr Val Thr Val Lys 1130 1135 gcccattctg gttcctggct gcctgcctct tgcacagaag ttcgttgtca tagtgctcac 3581 cttgggaaaa ggattaggtg ggcacataag attccgatca gaccccaacc atgctgcatg 3641 tgtaaagaag gattgaaaat aaaattgcac tttttaggta caaaatcata aaagctgttt 3701 cactagaaaa ggcagaaagc agtgtattaa ggtgttgaat tacgccagaa gacctgaaat 3761 gccttgtacc tacaacaatg cttaggcttt tctaagcctc ttgccacttt taaaattatc 3821 cttcaggcat aaatattttt gacagcagaa tagaagaatg attcatgaga acctgaacca 3881 gatgaacagc tactagttat tttatcaaat acagatgaca tttaaaaatt cttaactaca 3941 agagattaga aatataaacc ttgcctggct cttgccagga gataacaaaa tgggttgctg 4001 atgaactgca cccttttaca tgtgggtaga atataagctc acatggcagt gagatgttga 4061 aaagtcaaaa gagacctgtc tctctccttt cttttctatc tttaaaccag aaaacctcat 4121 actcagtcct cagtgaaaga aagtaaagta ttaaggactt taggcagaag agcattgtgt 4181 aacttgactg aagatcatcc attaatagtt attaggcatt taggtaaaat 
tttctaatac 4241 ctaaaaattg tcaaaaacag tcaatagggc tactgctggc ccaaagacca tttaggtcca 4301 cctcctcttt tttgctcttt tttttttttc tgtgacagtt tcactgtgtc gcccaggctg 4361 gcgttcagtg gtgcaatctc agctcactgc aaactctgtc tcctgggctc aagtgattct 4421 cgtgcctcag cctcccgaat agctggaatt acgggcatgc accaccacac ctggctaatt 4481 tttgtatttt taatagagat ggggtttcac catattggcc aggctgatct ctaactcctg 4541 gcctcaagtg atctatctgc ctccctcagc ctcccaaagt ctgggattgc agacaagtca 4601 tcgtacccgg ccttcttttt tgcccttaaa agtaagggat gtgggtttgt acaaaaaaaa 4661 aaaaaaaaaa aaaaaaaaac cagcatacat atgcaaaact atatatatat gtatatgtag 4721 agaaaaatac ttcccattga tcatttttaa aaggcttctg attggatatt gtgttttaac 4781 caaattttaa agattaatgg aatcatgaaa gggaaaaaat tgatacaact atgcagattt 4841 tataaatgtg caataaaagt atttgtttta ca 4873 78 1139 PRT Homo sapiens 78 Met Thr Ala Glu Leu Gln Gln Asp Asp Ala Ala Gly Ala Ala Asp Gly 1 5 10 15 His Gly Ser Ser Cys Gln Met Leu Leu Asn Gln Leu Arg Glu Ile Thr 20 25 30 Gly Ile Gln Asp Pro Ser Phe Leu His Glu Ala Leu Lys Ala Ser Asn 35 40 45 Gly Asp Ile Thr Gln Ala Val Ser Leu Leu Thr Asp Glu Arg Val Lys 50 55 60 Glu Pro Ser Gln Asp Thr Val Ala Thr Glu Pro Ser Glu Val Glu Gly 65 70 75 80 Ser Ala Ala Asn Lys Glu Val Leu Ala Lys Val Ile Asp Leu Thr His 85 90 95 Asp Asn Lys Asp Asp Leu Gln Ala Ala Ile Ala Leu Ser Leu Leu Glu 100 105 110 Ser Pro Lys Ile Gln Ala Asp Gly Arg Asp Leu Asn Arg Met His Glu 115 120 125 Ala Thr Ser Ala Glu Thr Lys Arg Ser Lys Arg Lys Arg Cys Glu Val 130 135 140 Trp Gly Glu Asn Pro Asn Pro Asn Asp Trp Arg Arg Val Asp Gly Trp 145 150 155 160 Pro Val Gly Leu Lys Asn Val Gly Asn Thr Cys Trp Phe Ser Ala Val 165 170 175 Ile Gln Ser Leu Phe Gln Leu Pro Glu Phe Arg Arg Leu Val Leu Ser 180 185 190 Tyr Ser Leu Pro Gln Asn Val Leu Glu Asn Cys Arg Ser His Thr Glu 195 200 205 Lys Arg Asn Ile Met Phe Met Gln Glu Leu Gln Tyr Leu Phe Ala Leu 210 215 220 Met Met Gly Ser Asn Arg Lys Phe Val Asp Pro Ser Ala Ala Leu Asp 225 230 235 240 Leu Leu Lys Gly Ala Phe Arg Ser Ser Glu Glu Gln Gln Gln Asp Val 245 
250 255 Ser Glu Phe Thr His Lys Leu Leu Asp Trp Leu Glu Asp Ala Phe Gln 260 265 270 Leu Ala Val Asn Val Asn Ser Pro Arg Asn Lys Ser Glu Asn Pro Met 275 280 285 Val Gln Leu Phe Tyr Gly Thr Phe Leu Thr Glu Gly Val Arg Glu Gly 290 295 300 Lys Pro Phe Cys Asn Asn Glu Thr Phe Gly Gln Tyr Pro Leu Gln Val 305 310 315 320 Asn Gly Tyr Arg Asn Leu Asp Glu Cys Leu Glu Gly Ala Met Val Glu 325 330 335 Gly Asp Val Glu Leu Leu Pro Ser Asp His Ser Val Lys Tyr Gly Gln 340 345 350 Glu Arg Trp Phe Thr Lys Leu Pro Pro Val Leu Thr Phe Glu Leu Ser 355 360 365 Arg Phe Glu Phe Asn Gln Ser Leu Gly Gln Pro Glu Lys Ile His Asn 370 375 380 Lys Leu Glu Phe Pro Gln Ile Ile Tyr Met Asp Arg Tyr Met Tyr Arg 385 390 395 400 Ser Lys Glu Leu Ile Arg Asn Lys Arg Glu Cys Ile Arg Lys Leu Lys 405 410 415 Glu Glu Ile Lys Ile Leu Gln Gln Lys Leu Glu Arg Tyr Val Lys Tyr 420 425 430 Gly Ser Gly Pro Ala Arg Phe Pro Leu Pro Asp Met Leu Lys Tyr Val 435 440 445 Ile Glu Phe Ala Ser Thr Lys Pro Ala Ser Glu Ser Cys Pro Pro Glu 450 455 460 Ser Asp Thr His Met Thr Leu Pro Leu Ser Ser Val His Cys Ser Val 465 470 475 480 Ser Asp Gln Thr Ser Lys Glu Ser Thr Ser Thr Glu Ser Ser Ser Gln 485 490 495 Asp Val Glu Ser Thr Phe Ser Ser Pro Glu Asp Ser Leu Pro Lys Ser 500 505 510 Lys Pro Leu Thr Ser Ser Arg Ser Ser Met Glu Met Pro Ser Gln Pro 515 520 525 Ala Pro Arg Thr Val Thr Asp Glu Glu Ile Asn Phe Val Lys Thr Cys 530 535 540 Leu Gln Arg Trp Arg Ser Glu Ile Glu Gln Asp Ile Gln Asp Leu Lys 545 550 555 560 Thr Cys Ile Ala Ser Thr Thr Gln Thr Ile Glu Gln Met Tyr Cys Asp 565 570 575 Pro Leu Leu Arg Gln Val Pro Tyr Arg Leu His Ala Val Leu Val His 580 585 590 Glu Gly Gln Ala Asn Ala Gly His Tyr Trp Ala Tyr Ile Tyr Asn Gln 595 600 605 Pro Arg Gln Ser Trp Leu Lys Tyr Asn Asp Ile Ser Val Thr Glu Ser 610 615 620 Ser Trp Glu Glu Val Glu Arg Asp Ser Tyr Gly Gly Leu Arg Asn Val 625 630 635 640 Ser Ala Tyr Cys Leu Met Tyr Ile Asn Asp Lys Leu Pro Tyr Phe Asn 645 650 655 Ala Glu Ala Ala Pro Thr Glu Ser Asp Gln Met Ser Glu Val Glu Ala 660 665 
670 Leu Ser Val Glu Leu Lys His Tyr Ile Gln Glu Asp Asn Trp Arg Phe 675 680 685 Glu Gln Glu Val Glu Glu Trp Glu Glu Glu Gln Ser Cys Lys Ile Pro 690 695 700 Gln Met Glu Ser Ser Thr Asn Ser Ser Ser Gln Asp Tyr Ser Thr Ser 705 710 715 720 Gln Glu Pro Ser Val Ala Ser Ser His Gly Val Arg Cys Leu Ser Ser 725 730 735 Glu His Ala Val Ile Val Lys Glu Gln Thr Ala Gln Ala Ile Ala Asn 740 745 750 Thr Ala Arg Ala Tyr Glu Lys Ser Gly Val Glu Ala Ala Leu Ser Glu 755 760 765 Val Lys Glu Ala Glu Pro Lys Lys Pro Met Pro Gln Glu Thr Asn Leu 770 775 780 Ala Glu Gln Ser Glu Gln Pro Pro Lys Ala Asn Asp Ala Glu Ser Thr 785 790 795 800 Ala Gln Pro Asn Ser Glu Val Ser Glu Val Glu Ile Pro Ser Val Gly 805 810 815 Arg Ile Leu Val Arg Ser Asp Ala Asp Gly Tyr Asp Glu Glu Val Met 820 825 830 Leu Ser Pro Ala Met Gln Gly Val Ile Leu Ala Ile Ala Lys Ala Arg 835 840 845 Gln Thr Phe Asp Arg Asp Gly Ser Glu Ala Gly Leu Ile Lys Ala Phe 850 855 860 His Glu Glu Tyr Ser Arg Leu Tyr Gln Leu Ala Lys Glu Thr Pro Thr 865 870 875 880 Ser His Ser Asp Pro Arg Leu Gln His Val Leu Val Tyr Phe Phe Gln 885 890 895 Asn Glu Ala Pro Lys Arg Val Val Glu Arg Thr Leu Leu Glu Gln Phe 900 905 910 Ala Asp Lys Asn Leu Ser Tyr Asp Glu Arg Ser Ile Ser Ile Met Lys 915 920 925 Val Ala Gln Ala Lys Leu Lys Glu Ile Gly Pro Asp Asp Met Asn Met 930 935 940 Glu Glu Tyr Lys Lys Trp His Glu Asp Tyr Ser Leu Phe Arg Lys Val 945 950 955 960 Ser Val Tyr Leu Leu Thr Gly Leu Glu Leu Tyr Gln Lys Gly Lys Tyr 965 970 975 Gln Glu Ala Leu Ser Tyr Leu Val Tyr Ala Tyr Gln Ser Asn Ala Ala 980 985 990 Leu Leu Met Lys Gly Pro Arg Arg Gly Val Lys Glu Ser Val Ile Ala 995 1000 1005 Leu Tyr Arg Arg Lys Cys Leu Leu Glu Leu Asn Ala Lys Ala Ala Ser 1010 1015 1020 Leu Phe Glu Thr Asn Asp Asp His Ser Val Thr Glu Gly Ile Asn Val 1025 1030 1035 1040 Met Asn Glu Leu Ile Ile Pro Cys Ile His Leu Ile Ile Asn Asn Asp 1045 1050 1055 Ile Ser Lys Asp Asp Leu Asp Ala Ile Glu Val Met Arg Asn His Trp 1060 1065 1070 Cys Ser Tyr Leu Gly Gln Asp Ile Ala Glu Asn Leu Gln Leu Cys 
Leu 1075 1080 1085 Gly Glu Phe Leu Pro Arg Leu Leu Asp Pro Ser Ala Glu Ile Ile Val 1090 1095 1100 Leu Lys Glu Pro Pro Thr Ile Arg Pro Asn Ser Pro Tyr Asp Leu Cys 1105 1110 1115 1120 Ser Arg Phe Ala Ala Val Met Glu Ser Ile Gln Gly Val Ser Thr Val 1125 1130 1135 Thr Val Lys 79 3420 DNA Homo sapiens 79 atgactgcgg agctgcagca ggacgacgcg gccggcgcgg cagacggcca cggctcgagc 60 tgccaaatgc tgttaaatca actgagagaa atcacaggca ttcaggaccc ttcctttctc 120 catgaagctc tgaaggccag taatggtgac attactcagg cagtcagcct tctcactgat 180 gagagagtta aggagcccag tcaagacact gttgctacag aaccatctga agtagagggg 240 agtgctgcca acaaggaagt attagcaaaa gttatagacc ttactcatga taacaaagat 300 gatcttcagg ctgccattgc tttgagtcta ctggagtctc ccaaaattca agctgatgga 360 agagatctta acaggatgca tgaagcaacc tctgcagaaa ctaaacgctc aaagagaaaa 420 cgctgtgaag tctggggaga aaaccccaat cccaatgact ggaggagagt tgatggttgg 480 ccagttgggc tgaaaaatgt tggcaataca tgttggttta gtgctgttat tcagtctctc 540 tttcaattgc ctgaatttcg aagacttgtt ctcagttata gtctgccaca aaatgtactt 600 gaaaattgtc gaagtcatac agaaaagaga aatatcatgt ttatgcaaga gcttcagtat 660 ttgtttgctc taatgatggg atcaaataga aaatttgtag acccgtctgc agccctggat 720 ctattaaagg gagcattccg atcatctgag gaacagcagc aagatgtgag tgaattcaca 780 cacaagctcc tggattggct agaggacgca ttccagctag ctgttaatgt taacagtccc 840 aggaacaaat ctgaaaatcc aatggtgcag ctgttctatg gtactttcct gactgaaggg 900 gttcgtgaag gaaaaccctt ttgtaacaat gagaccttcg gccagtatcc tcttcaggta 960 aacggttatc gcaacttaga cgagtgtttg gaaggggcca tggtggaggg tgatgttgag 1020 cttcttccct ccgatcactc ggtgaagtat ggacaagagc gttggtttac aaagctacct 1080 ccagtgttga cctttgaact ctcaagattt gagtttaatc agtcccttgg gcagccagag 1140 aaaattcaca ataagctgga atttcctcag attatttata tggacaggta catgtacagg 1200 agcaaggagc ttattcgaaa taagagagag tgtattcgaa agttgaagga ggaaataaaa 1260 attctgcagc aaaaattgga aaggtatgtg aaatatggct caggcccagc tcggttcccg 1320 ctcccggaca tgctgaaata tgttattgaa tttgctagta caaaacctgc ctcagaaagc 1380 tgtccacctg aaagtgacac acatatgaca ttaccacttt cttcagtgca ctgctcggtt 1440 tctgaccaga 
catccaagga aagtacaagt acagaaagct cttctcagga tgttgaaagt 1500 accttttctt ctcctgaaga ttctttaccc aagtctaaac cactgacatc ttctcggtct 1560 tccatggaaa tgccttcaca gccagctcca cgaacagtca cagatgagga gataaatttt 1620 gttaagacct gtcttcagag atggaggagt gagattgaac aagatataca agatttaaag 1680 acttgtattg caagtactac tcagactatt gaacagatgt actgcgatcc tctccttcgt 1740 caggtgcctt atcgcttgca tgcagttctt gttcatgaag gacaagcaaa tgctggacac 1800 tattgggcct atatctataa tcaaccccga cagagctggc tcaagtacaa tgacatctct 1860 gttactgaat cttcctggga agaagttgaa agagattcct atggaggcct gagaaatgtt 1920 agtgcttact gtctgatgta cattaatgac aaactaccct acttcaatgc agaggcagcc 1980 ccaactgaat cagatcaaat gtcagaagtg gaagccctat ctgtggaact caagcattac 2040 attcaggagg ataactggcg gtttgagcag gaagtagagg agtgggaaga agagcagtct 2100 tgcaaaatcc ctcaaatgga gtcctccacc aactcctcat cacaggacta ctctacatca 2160 caagagcctt cagtagcctc ttctcatggg gttcgctgct tgtcgtctga gcatgctgtg 2220 attgtaaagg agcaaactgc ccaggctatt gcaaacacag cccgtgccta tgagaagagc 2280 ggtgtagaag cggcactgag tgaggttaaa gaagctgaac ccaagaagcc catgccccag 2340 gaaacaaacc ttgcagagca gtcagaacag cccccaaagg ctaatgatgc agagtctact 2400 gcccagccta attctgaggt ctctgaagtc gagattccca gtgtgggaag gattctggtt 2460 agatctgatg cagatggata tgatgaggag gtgatgctga gccctgccat gcaaggggtc 2520 atcctggcca tagctaaagc ccgtcagacc tttgaccgag atgggtctga agcagggctg 2580 attaaggcat tccatgaaga atactccagg ctctatcagc ttgccaaaga gacccccacc 2640 tctcacagtg atcctcgact tcagcatgtc cttgtctact ttttccaaaa tgaagcaccc 2700 aaaagggtag tagaacgaac ccttctggaa cagtttgcag ataaaaatct tagctatgat 2760 gaaagatcaa tcagcattat gaaggtggct caagcgaaac tgaaggaaat tggtccagat 2820 gacatgaata tggaagagta caagaagtgg catgaagatt atagtttgtt ccgaaaagtg 2880 tctgtgtatc tcctaacagg cctagaactc tatcaaaaag gaaagtacca agaggcactt 2940 tcctacctgg tatatgccta ccagagcaat gctgccctgc tgatgaaggg gccccgccgg 3000 ggggtcaaag aatccgtgat tgctttatac cgaagaaaat gccttctgga gctgaatgcc 3060 aaagcagctt ctctttttga aacaaatgat gatcactccg taactgaggg cattaatgtg 3120 atgaatgaac tgatcatccc 
ctgcattcac cttatcatta ataatgacat ttccaaggat 3180 gatctggatg ccattgaggt catgagaaac cattggtgct cttaccttgg gcaagatatt 3240 gcagaaaatc tgcagctgtg cctaggggag tttctaccca gacttctaga tccttctgca 3300 gaaatcatcg tcttgaaaga gcctccaact attcgaccca attctcccta tgacctatgt 3360 agccgatttg cagctgtcat ggagtcaatt cagggagttt caactgtgac agtgaaataa 3420 80 2082 DNA Homo sapiens CDS (115)...(1518) 80 cactagtaac gccgccatgt gctggaattc gcccttctcg ggaagcgcgc cattgtgttg 60 gtacccggga attcgcggcc gcgtcgacgc ccgccggggc tctccagctt cgcc atg 117 Met 1 ccg ccg tgg ggc gcc gcc ctc gcg ctc atc ttg gcc gtg ctc gcc ctt 165 Pro Pro Trp Gly Ala Ala Leu Ala Leu Ile Leu Ala Val Leu Ala Leu 5 10 15 ctc ggc ctg ctc ggc ccg cgg ctc cgg gga ccc tgg ggg cgc gcc gtc 213 Leu Gly Leu Leu Gly Pro Arg Leu Arg Gly Pro Trp Gly Arg Ala Val 20 25 30 gga gag agg acc ctg ccg ggg gcc caa gac cga gac gac ggg gag gag 261 Gly Glu Arg Thr Leu Pro Gly Ala Gln Asp Arg Asp Asp Gly Glu Glu 35 40 45 gcg gac ggc gga ggc ccg gcg gac cag ttc agc gac ggg cgc gag cca 309 Ala Asp Gly Gly Gly Pro Ala Asp Gln Phe Ser Asp Gly Arg Glu Pro 50 55 60 65 ctg ccg gga ggg tgc agc ctt gtt tgc aag ccg tcg gcc ctg gcc cag 357 Leu Pro Gly Gly Cys Ser Leu Val Cys Lys Pro Ser Ala Leu Ala Gln 70 75 80 tgc ctg ctg cgc gcc ctg cgg cgc tca gag gcg ctg gag gcc ggc ccg 405 Cys Leu Leu Arg Ala Leu Arg Arg Ser Glu Ala Leu Glu Ala Gly Pro 85 90 95 cgc tcc tgg ttc tcc ggg ccc cac ctg cag acc ctc tgc cac ttc gtc 453 Arg Ser Trp Phe Ser Gly Pro His Leu Gln Thr Leu Cys His Phe Val 100 105 110 ctg ccc gta gcg cct ggg cct gag ctg gcc cgg gag tac ctg cag ttg 501 Leu Pro Val Ala Pro Gly Pro Glu Leu Ala Arg Glu Tyr Leu Gln Leu 115 120 125 gcg gac gat ggg cta gtg gcc ctg gac tgg gtg gta gga cct tgt gtt 549 Ala Asp Asp Gly Leu Val Ala Leu Asp Trp Val Val Gly Pro Cys Val 130 135 140 145 cgg ggc cgc cgg atc acc agc gcc ggg ggc ctt cct gcg gtg ctt ctg 597 Arg Gly Arg Arg Ile Thr Ser Ala Gly Gly Leu Pro Ala Val Leu Leu 150 155 160 gtg atc ccc aat gcg tgg ggt cgc ctc acc cgc aac 
gtg ctc ggc ctt 645 Val Ile Pro Asn Ala Trp Gly Arg Leu Thr Arg Asn Val Leu Gly Leu 165 170 175 tgc ttg ctc gcc ctg gag cgc ggc tac tac ccg gtc atc ttc cat cgc 693 Cys Leu Leu Ala Leu Glu Arg Gly Tyr Tyr Pro Val Ile Phe His Arg 180 185 190 cgc ggc cac cac ggt tgc cca ctg gtc agc ccc cgg ctg cag cct ttc 741 Arg Gly His His Gly Cys Pro Leu Val Ser Pro Arg Leu Gln Pro Phe 195 200 205 ggg gac ccg tcc gac ctc aag gag gcg gtc aca tac atc cgc ttc cga 789 Gly Asp Pro Ser Asp Leu Lys Glu Ala Val Thr Tyr Ile Arg Phe Arg 210 215 220 225 cac ccg gcg gcg ccg ctg ttc gcg gtg agc gaa ggc tcg ggc tcg gcg 837 His Pro Ala Ala Pro Leu Phe Ala Val Ser Glu Gly Ser Gly Ser Ala 230 235 240 ctg ctc ctg tcc tac ctg ggc gag tgc ggc tcc tcc agc tac gtg aca 885 Leu Leu Leu Ser Tyr Leu Gly Glu Cys Gly Ser Ser Ser Tyr Val Thr 245 250 255 ggc gcc gcc tgc atc tcg ccc gtg ctg cgc tgc cga gag tgg ttc gag 933 Gly Ala Ala Cys Ile Ser Pro Val Leu Arg Cys Arg Glu Trp Phe Glu 260 265 270 gcc ggc ctg ccc tgg ccc tac gag cgg ggc ttt ctg ctc cac cag aag 981 Ala Gly Leu Pro Trp Pro Tyr Glu Arg Gly Phe Leu Leu His Gln Lys 275 280 285 atc gcc ctc agc agg tat gcc aca gcc ctg gag gac act gtg gac acc 1029 Ile Ala Leu Ser Arg Tyr Ala Thr Ala Leu Glu Asp Thr Val Asp Thr 290 295 300 305 agc aga ctg ttc agg agc cgt tcc ctt cga gag ttt gag gag gct ctc 1077 Ser Arg Leu Phe Arg Ser Arg Ser Leu Arg Glu Phe Glu Glu Ala Leu 310 315 320 ttc tgc cac acc aaa agc ttc ccc atc agc tgg gat gcc tac tgg gac 1125 Phe Cys His Thr Lys Ser Phe Pro Ile Ser Trp Asp Ala Tyr Trp Asp 325 330 335 cgc aac gac ccg ctc cgg gat gtc gat gag gca gcc gtg cct gtg ctg 1173 Arg Asn Asp Pro Leu Arg Asp Val Asp Glu Ala Ala Val Pro Val Leu 340 345 350 tgt atc tgc agt gct gac gac ccc gtg tgt gga ccc cca gac cac act 1221 Cys Ile Cys Ser Ala Asp Asp Pro Val Cys Gly Pro Pro Asp His Thr 355 360 365 ctg aca act gaa ctc ttc cac agc aac ccc tac ttc ttc ctc ctg ctc 1269 Leu Thr Thr Glu Leu Phe His Ser Asn Pro Tyr Phe Phe Leu Leu Leu 370 375 380 385 agt cgc cac 
gga ggc cac tgt ggc ttc ctg cgc cag gag ccc ttg cca 1317 Ser Arg His Gly Gly His Cys Gly Phe Leu Arg Gln Glu Pro Leu Pro 390 395 400 gcc tgg agc cat gag gtc atc ttg gag tcc ttc cgg gcc ttg act gag 1365 Ala Trp Ser His Glu Val Ile Leu Glu Ser Phe Arg Ala Leu Thr Glu 405 410 415 ttc ttc cga acg gag gag agg att aaa ggg ctg agc agg cac aga gct 1413 Phe Phe Arg Thr Glu Glu Arg Ile Lys Gly Leu Ser Arg His Arg Ala 420 425 430 tcc ttc ctt ggg ggc cgt cgt cgt ggg gga gcc ttg cag agg cgg gaa 1461 Ser Phe Leu Gly Gly Arg Arg Arg Gly Gly Ala Leu Gln Arg Arg Glu 435 440 445 gtc tct tcc tct tcc aac ctg gag gag atc ttt aac tgg aag cga tca 1509 Val Ser Ser Ser Ser Asn Leu Glu Glu Ile Phe Asn Trp Lys Arg Ser 450 455 460 465 tac aca agg tgagagacct ggcctgagaa cccccaagtc ctgcaaagaa 1558 Tyr Thr Arg aaacagagct gggcaagggg gagtcctgga aagatggggc ggactgaaca gagggagctc 1618 cagctctgtg ctcctcattc agtccctctc tcttaaattg gtgccttgaa agagaaggaa 1678 cgtcctgcga gcctgcactc acttcatcct cagcagaact cctgcctggc ctctgctcaa 1738 catatcccta ctcatccggt cagcagcggc gcgttccagt cactgtcacc tgtcactgac 1798 atcacaagcc aaaggatagc actttttcaa tccatggact caggagaaaa tgccctctta 1858 ctggcagtgg ctagagggat gagacgtttg tgtatgtcac tgggcagtga ccccgattct 1918 caagctggag ccatttgatg tcatgaggac aggatgtttg tgtctcggcc ccacttccct 1978 catttgctct gtggttgtgg cgccctgctt tgaccgaatg ctctggcaac tgcggcagca 2038 ggcttgtgtg tgtgagaagg gcggcagagg cagtggggct ggct 2082 81 468 PRT Homo sapiens 81 Met Pro Pro Trp Gly Ala Ala Leu Ala Leu Ile Leu Ala Val Leu Ala 1 5 10 15 Leu Leu Gly Leu Leu Gly Pro Arg Leu Arg Gly Pro Trp Gly Arg Ala 20 25 30 Val Gly Glu Arg Thr Leu Pro Gly Ala Gln Asp Arg Asp Asp Gly Glu 35 40 45 Glu Ala Asp Gly Gly Gly Pro Ala Asp Gln Phe Ser Asp Gly Arg Glu 50 55 60 Pro Leu Pro Gly Gly Cys Ser Leu Val Cys Lys Pro Ser Ala Leu Ala 65 70 75 80 Gln Cys Leu Leu Arg Ala Leu Arg Arg Ser Glu Ala Leu Glu Ala Gly 85 90 95 Pro Arg Ser Trp Phe Ser Gly Pro His Leu Gln Thr Leu Cys His Phe 100 105 110 Val Leu Pro Val Ala Pro Gly Pro Glu Leu 
Ala Arg Glu Tyr Leu Gln 115 120 125 Leu Ala Asp Asp Gly Leu Val Ala Leu Asp Trp Val Val Gly Pro Cys 130 135 140 Val Arg Gly Arg Arg Ile Thr Ser Ala Gly Gly Leu Pro Ala Val Leu 145 150 155 160 Leu Val Ile Pro Asn Ala Trp Gly Arg Leu Thr Arg Asn Val Leu Gly 165 170 175 Leu Cys Leu Leu Ala Leu Glu Arg Gly Tyr Tyr Pro Val Ile Phe His 180 185 190 Arg Arg Gly His His Gly Cys Pro Leu Val Ser Pro Arg Leu Gln Pro 195 200 205 Phe Gly Asp Pro Ser Asp Leu Lys Glu Ala Val Thr Tyr Ile Arg Phe 210 215 220 Arg His Pro Ala Ala Pro Leu Phe Ala Val Ser Glu Gly Ser Gly Ser 225 230 235 240 Ala Leu Leu Leu Ser Tyr Leu Gly Glu Cys Gly Ser Ser Ser Tyr Val 245 250 255 Thr Gly Ala Ala Cys Ile Ser Pro Val Leu Arg Cys Arg Glu Trp Phe 260 265 270 Glu Ala Gly Leu Pro Trp Pro Tyr Glu Arg Gly Phe Leu Leu His Gln 275 280 285 Lys Ile Ala Leu Ser Arg Tyr Ala Thr Ala Leu Glu Asp Thr Val Asp 290 295 300 Thr Ser Arg Leu Phe Arg Ser Arg Ser Leu Arg Glu Phe Glu Glu Ala 305 310 315 320 Leu Phe Cys His Thr Lys Ser Phe Pro Ile Ser Trp Asp Ala Tyr Trp 325 330 335 Asp Arg Asn Asp Pro Leu Arg Asp Val Asp Glu Ala Ala Val Pro Val 340 345 350 Leu Cys Ile Cys Ser Ala Asp Asp Pro Val Cys Gly Pro Pro Asp His 355 360 365 Thr Leu Thr Thr Glu Leu Phe His Ser Asn Pro Tyr Phe Phe Leu Leu 370 375 380 Leu Ser Arg His Gly Gly His Cys Gly Phe Leu Arg Gln Glu Pro Leu 385 390 395 400 Pro Ala Trp Ser His Glu Val Ile Leu Glu Ser Phe Arg Ala Leu Thr 405 410 415 Glu Phe Phe Arg Thr Glu Glu Arg Ile Lys Gly Leu Ser Arg His Arg 420 425 430 Ala Ser Phe Leu Gly Gly Arg Arg Arg Gly Gly Ala Leu Gln Arg Arg 435 440 445 Glu Val Ser Ser Ser Ser Asn Leu Glu Glu Ile Phe Asn Trp Lys Arg 450 455 460 Ser Tyr Thr Arg 465 82 1407 DNA Homo sapiens 82 atgccgccgt ggggcgccgc cctcgcgctc atcttggccg tgctcgccct tctcggcctg 60 ctcggcccgc ggctccgggg accctggggg cgcgccgtcg gagagaggac cctgccgggg 120 gcccaagacc gagacgacgg ggaggaggcg gacggcggag gcccggcgga ccagttcagc 180 gacgggcgcg agccactgcc gggagggtgc agccttgttt gcaagccgtc ggccctggcc 240 cagtgcctgc tgcgcgccct 
gcggcgctca gaggcgctgg aggccggccc gcgctcctgg 300 ttctccgggc cccacctgca gaccctctgc cacttcgtcc tgcccgtagc gcctgggcct 360 gagctggccc gggagtacct gcagttggcg gacgatgggc tagtggccct ggactgggtg 420 gtaggacctt gtgttcgggg ccgccggatc accagcgccg ggggccttcc tgcggtgctt 480 ctggtgatcc ccaatgcgtg gggtcgcctc acccgcaacg tgctcggcct ttgcttgctc 540 gccctggagc gcggctacta cccggtcatc ttccatcgcc gcggccacca cggttgccca 600 ctggtcagcc cccggctgca gcctttcggg gacccgtccg acctcaagga ggcggtcaca 660 tacatccgct tccgacaccc ggcggcgccg ctgttcgcgg tgagcgaagg ctcgggctcg 720 gcgctgctcc tgtcctacct gggcgagtgc ggctcctcca gctacgtgac aggcgccgcc 780 tgcatctcgc ccgtgctgcg ctgccgagag tggttcgagg ccggcctgcc ctggccctac 840 gagcggggct ttctgctcca ccagaagatc gccctcagca ggtatgccac agccctggag 900 gacactgtgg acaccagcag actgttcagg agccgttccc ttcgagagtt tgaggaggct 960 ctcttctgcc acaccaaaag cttccccatc agctgggatg cctactggga ccgcaacgac 1020 ccgctccggg atgtcgatga ggcagccgtg cctgtgctgt gtatctgcag tgctgacgac 1080 cccgtgtgtg gacccccaga ccacactctg acaactgaac tcttccacag caacccctac 1140 ttcttcctcc tgctcagtcg ccacggaggc cactgtggct tcctgcgcca ggagcccttg 1200 ccagcctgga gccatgaggt catcttggag tccttccggg ccttgactga gttcttccga 1260 acggaggaga ggattaaagg gctgagcagg cacagagctt ccttccttgg gggccgtcgt 1320 cgtgggggag ccttgcagag gcgggaagtc tcttcctctt ccaacctgga ggagatcttt 1380 aactggaagc gatcatacac aaggtga 1407 83 32 PRT Artificial Sequence consensus sequence 83 Thr Gly Leu Ile Asn Leu Gly Asn Thr Cys Tyr Met Asn Ser Val Leu 1 5 10 15 Gln Cys Leu Phe Ser Ile Pro Pro Leu Arg Asp Tyr Leu Leu Asp Ile 20 25 30 84 69 PRT Artificial Sequence consensus sequence 84 Gly Pro Gly Lys Tyr Glu Leu Tyr Ala Val Val Val His Ser Gly Ser 1 5 10 15 Ser Leu Ser Gly Gly His Tyr Thr Ala Tyr Val Lys Lys Glu Asn Trp 20 25 30 Tyr Lys Phe Asp Asp Asp Lys Val Ser Arg Val Thr Glu Glu Glu Val 35 40 45 Leu Lys Glu Ser Gly Gly Glu Ser Gly Asp Thr Ser Ser Ala Tyr Ile 50 55 60 Leu Phe Tyr Glu Arg 65 85 41 PRT Artificial Sequence consensus sequence 85 Glu Asp Glu Glu Lys Ile Glu 
Gln Leu Val Glu Met Gly Phe Asp Arg 1 5 10 15 Glu Glu Val Val Lys Ala Leu Arg Ala Thr Asn Gly Asn Gly Val Glu 20 25 30 Arg Ala Ala Glu Trp Leu Leu Ser His 35 40 86 18 PRT Artificial Sequence consensus sequence 86 Glu Ser Glu Glu Glu Asp Leu Gln Leu Ala Leu Ala Leu Ser Leu Glu 1 5 10 15 Glu Ala 87 232 PRT Artificial Sequence consensus sequence 87 Phe Arg Val Ile Leu Leu Asp Leu Arg Gly Phe Gly Glu Ser Ser Pro 1 5 10 15 Ser Asp Leu Ala Glu Tyr Arg Phe Asp Asp Leu Ala Glu Asp Leu Glu 20 25 30 Ala Leu Leu Asp Ala Leu Gly Leu Glu Lys Pro Val Ile Leu Val Gly 35 40 45 His Ser Met Gly Gly Ala Ile Ala Leu Ala Tyr Ala Ala Lys Tyr Pro 50 55 60 Glu Leu Arg Val Lys Ala Leu Val Leu Val Ser Pro Pro Leu Pro Ala 65 70 75 80 Gly Leu Ser Ser Asp Leu Phe Pro Arg Gln Gly Asn Leu Glu Gly Leu 85 90 95 Leu Leu Ala Asn Phe Arg Asn Arg Leu Ser Arg Ser Val Glu Ala Leu 100 105 110 Leu Gly Arg Ala Leu Lys Gln Phe Phe Leu Leu Gly Arg Pro Leu Val 115 120 125 Ser Asp Phe Leu Lys Gln Ala Glu Asp Trp Leu Ser Ser Leu Ile Arg 130 135 140 Gln Gly Glu Asp Asp Gly Gly Asp Gly Leu Leu Gly Ala Ala Val Ala 145 150 155 160 Leu Gly Lys Leu Leu Gln Trp Asp Leu Ser Ala Leu Lys Asp Ile Lys 165 170 175 Val Pro Thr Leu Val Ile Trp Gly Thr Asp Asp Pro Leu Val Pro Leu 180 185 190 Asp Ala Ser Glu Lys Leu Ser Ala Leu Ile Pro Asn Ala Glu Val Val 195 200 205 Val Ile Asp Asp Ala Gly His Leu Ala Leu Leu Glu Lys Pro Glu Glu 210 215 220 Val Ala Glu Leu Ile Lys Phe Leu 225 230 88 341 PRT Artificial Sequence consensus sequence 88 Pro Asn Phe Trp Leu Phe Asn Gly His Val Gln Thr Ile Trp Ala Ser 1 5 10 15 Phe Phe Arg Arg Lys Arg Cys Pro Thr Val Tyr Tyr Arg Arg Glu Ile 20 25 30 Leu Glu Leu Lys Asp Gly Gly Thr Val Thr Leu Asp Trp Met Glu Pro 35 40 45 Glu Gly Glu Asp Gln Asp Phe Asn Ser Asp Pro Asp Ser Pro Leu Val 50 55 60 Val Ile Leu His Gly Leu Thr Gly Gly Ser His Glu Pro Tyr Ile Arg 65 70 75 80 His Leu Val His Glu Leu Ala Arg Lys Arg Gly Trp Arg Cys Val Val 85 90 95 Leu Asn His Arg Gly Cys Gly Gly Ser Pro Ile Thr Thr Pro Arg 
Leu 100 105 110 Tyr Thr Ala Gly His Thr Glu Asp Ile Arg Glu Val Ile Glu His Leu 115 120 125 Lys Gln Arg Tyr Pro Glu Ala Pro Leu Tyr Ala Val Gly Phe Ser Leu 130 135 140 Gly Gly Asn Met Leu Thr Asn Tyr Leu Gly Glu Glu Gly Asp Asn Cys 145 150 155 160 Pro Leu Ser Ala Ala Val Thr Ile Cys Asn Pro Trp Asp Leu Glu Glu 165 170 175 Cys Ser Glu Ser Ile Glu Lys Gly Leu Met Ser Arg Arg Leu Tyr Asn 180 185 190 Arg Tyr Leu Thr Lys Asn Leu Lys Arg Met Val Gln Arg His Arg Asn 195 200 205 His Phe Glu Asp Ile Glu Lys Lys Ala Glu Tyr Asn Ala Glu Glu Ile 210 215 220 Asp Leu Glu Arg Leu Lys Lys Ala Arg Thr Ile Arg Glu Phe Asp Asp 225 230 235 240 Asn Ile Thr Ala Pro Met Tyr Gly Phe Lys Asp Ala Glu Asp Tyr Tyr 245 250 255 Arg Gln Ala Ser Ser Met Pro Tyr Leu Asp Asn Ile Arg Val Pro Leu 260 265 270 Leu Cys Ile Asn Ala Ala Asp Asp Pro Phe Met Pro Glu Glu Ala Ile 275 280 285 Pro Pro Asp Glu Ala Lys Gln Asn Pro Asn Val Val Leu Val Ile Thr 290 295 300 Ser His Gly Gly His Ile Gly Phe Ile Glu Gly Thr Trp Tyr Pro Ser 305 310 315 320 Gly Ser Gln Trp Leu Asp Gln Thr Ile Met Glu Tyr Leu Glu Ser Phe 325 330 335 Arg Thr Asn Arg Arg 340 89 19 PRT Artificial Sequence signature pattern 89 Tyr Xaa Leu Xaa Xaa Xaa Xaa Xaa His Xaa Gly Xaa Xaa Xaa Xaa Xaa 1 5 10 15 Gly His Tyr 90 16 PRT Homo sapiens 90 Gly Leu Thr Asn Leu Gly Ala Thr Cys Tyr Leu Ala Ser Thr Ile Gln 1 5 10 15 91 16 PRT Homo sapiens 91 Gly Leu Lys Asn Val Gly Asn Thr Cys Trp Phe Ser Ala Val Ile Gln 1 5 10 15 92 18 PRT Homo sapiens 92 Tyr Asp Leu Ile Gly Val Thr Val His Thr Gly Thr Ala Asp Gly Gly 1 5 10 15 His Tyr 93 18 PRT Homo sapiens 93 Tyr Arg Leu His Ala Val Leu Val His Glu Gly Gln Ala Asn Ala Gly 1 5 10 15 His Tyr 94 3954 DNA Homo sapiens CDS (1)...(3951) 94 atg gtg gct gat gcc tgt aat ccc aac agt ttg gga gac tgg gga gga 48 Met Val Ala Asp Ala Cys Asn Pro Asn Ser Leu Gly Asp Trp Gly Gly 1 5 10 15 aga tca ttt gag gcc agg agt ttg aga cca gcc tgg gct caa gca gtc 96 Arg Ser Phe Glu Ala Arg Ser Leu Arg Pro Ala Trp Ala Gln Ala Val 20 25 30 ctg 
cct cag cct ccc aaa gtg ctg gga tta cag atg ggt cat ctt act 144 Leu Pro Gln Pro Pro Lys Val Leu Gly Leu Gln Met Gly His Leu Thr 35 40 45 ctg gaa gac tat cag atc tgg agt gtg aaa aat gtt ctt gcc aat gag 192 Leu Glu Asp Tyr Gln Ile Trp Ser Val Lys Asn Val Leu Ala Asn Glu 50 55 60 ttt ttg aac ctc ctt ttc cag gtg tgt cac ata gtt ctg ggg tta aga 240 Phe Leu Asn Leu Leu Phe Gln Val Cys His Ile Val Leu Gly Leu Arg 65 70 75 80 cca gct act ccg gaa gaa gaa gga caa att att aga gga tgg tta gaa 288 Pro Ala Thr Pro Glu Glu Glu Gly Gln Ile Ile Arg Gly Trp Leu Glu 85 90 95 cga gag agc agg tat ggt ctg caa gca gga cac aac tgg ttt atc atc 336 Arg Glu Ser Arg Tyr Gly Leu Gln Ala Gly His Asn Trp Phe Ile Ile 100 105 110 tcc atg cag tgg tgg caa cag tgg aaa gaa tat gtc aaa tac gat gcc 384 Ser Met Gln Trp Trp Gln Gln Trp Lys Glu Tyr Val Lys Tyr Asp Ala 115 120 125 aac cct gtg gta att gag cca tca tct gtt ttg aat gga gga aaa tac 432 Asn Pro Val Val Ile Glu Pro Ser Ser Val Leu Asn Gly Gly Lys Tyr 130 135 140 tca ttt gga act gca gcc cat cct atg gag cag gtc gaa gat aga att 480 Ser Phe Gly Thr Ala Ala His Pro Met Glu Gln Val Glu Asp Arg Ile 145 150 155 160 gga agc agc ctc agt tac gtg aat act aca gaa gag aaa ttt tca gac 528 Gly Ser Ser Leu Ser Tyr Val Asn Thr Thr Glu Glu Lys Phe Ser Asp 165 170 175 aac att tct act gca tct gaa gcc tca gaa act gct ggc agc ggc ttt 576 Asn Ile Ser Thr Ala Ser Glu Ala Ser Glu Thr Ala Gly Ser Gly Phe 180 185 190 ctg tat tct gcc aca cca ggg gca gat gtt tgc ttt gct cga caa cat 624 Leu Tyr Ser Ala Thr Pro Gly Ala Asp Val Cys Phe Ala Arg Gln His 195 200 205 aac act tct gac aat aac aac cag tgt ttg ctg gga gcc aat ggg aat 672 Asn Thr Ser Asp Asn Asn Asn Gln Cys Leu Leu Gly Ala Asn Gly Asn 210 215 220 att ttg ttg cac ctt aac cct cag aaa cca ggg gct att gat aat cag 720 Ile Leu Leu His Leu Asn Pro Gln Lys Pro Gly Ala Ile Asp Asn Gln 225 230 235 240 cca tta gta act caa gaa cca gta aag gct aca tca tta aca cta gaa 768 Pro Leu Val Thr Gln Glu Pro Val Lys Ala Thr Ser Leu Thr Leu Glu 
245 250 255 gga gga cga tta aaa cga act cca cag ctg att cat gga aga gac tat 816 Gly Gly Arg Leu Lys Arg Thr Pro Gln Leu Ile His Gly Arg Asp Tyr 260 265 270 gaa atg gtc cca gaa cct gtg tgg aga gca ctt tat cac tgg tat gga 864 Glu Met Val Pro Glu Pro Val Trp Arg Ala Leu Tyr His Trp Tyr Gly 275 280 285 gca aac ctg gcc tta cct aga cca gtt atc aag aac agc aag aca gac 912 Ala Asn Leu Ala Leu Pro Arg Pro Val Ile Lys Asn Ser Lys Thr Asp 290 295 300 atc cca gag ctg gaa tta ttt ccc cgc tat ctt ctc ttc ctg aga cag 960 Ile Pro Glu Leu Glu Leu Phe Pro Arg Tyr Leu Leu Phe Leu Arg Gln 305 310 315 320 cag cct gcc act cgg aca cag cag tct aac atc tgg gtg aat atg gga 1008 Gln Pro Ala Thr Arg Thr Gln Gln Ser Asn Ile Trp Val Asn Met Gly 325 330 335 aat gta cct tct ccg aat gca cct tta aag cgg gta tta gcc tat aca 1056 Asn Val Pro Ser Pro Asn Ala Pro Leu Lys Arg Val Leu Ala Tyr Thr 340 345 350 ggc tgt ttt agt cga atg cag acc atc aag gaa att cac gaa tat cta 1104 Gly Cys Phe Ser Arg Met Gln Thr Ile Lys Glu Ile His Glu Tyr Leu 355 360 365 tct caa aga ctg cgc att aaa gag gaa gat atg cgc ctg tgg cta tac 1152 Ser Gln Arg Leu Arg Ile Lys Glu Glu Asp Met Arg Leu Trp Leu Tyr 370 375 380 aac agt gag aac tac ctt act ctt ctg gat gat gag gat cat aaa ttg 1200 Asn Ser Glu Asn Tyr Leu Thr Leu Leu Asp Asp Glu Asp His Lys Leu 385 390 395 400 gaa tat ttg aaa atc cag gat gaa caa cac ctg gta att gaa gtt cgc 1248 Glu Tyr Leu Lys Ile Gln Asp Glu Gln His Leu Val Ile Glu Val Arg 405 410 415 aac aaa gat atg agt tgg cct gag gag atg tct ttt ata gca aat agt 1296 Asn Lys Asp Met Ser Trp Pro Glu Glu Met Ser Phe Ile Ala Asn Ser 420 425 430 agt aaa ata gat aga cac aag gtt ccc aca gaa aag gga gcc aca ggt 1344 Ser Lys Ile Asp Arg His Lys Val Pro Thr Glu Lys Gly Ala Thr Gly 435 440 445 cta agc aat ctg gga aac aca tgc ttc atg aac tca agc atc cag tgt 1392 Leu Ser Asn Leu Gly Asn Thr Cys Phe Met Asn Ser Ser Ile Gln Cys 450 455 460 gtt agt aac aca cag cca ctg aca cag tat ttt atc tca ggg aga cat 1440 Val Ser Asn Thr Gln Pro Leu 
Thr Gln Tyr Phe Ile Ser Gly Arg His 465 470 475 480 ctt tat gaa ctc aac agg aca aat ccc att ggt atg aag ggg cat atg 1488 Leu Tyr Glu Leu Asn Arg Thr Asn Pro Ile Gly Met Lys Gly His Met 485 490 495 gct aaa tgc tat ggt gat tta gtg cag gaa ctt tgg agt gga act cag 1536 Ala Lys Cys Tyr Gly Asp Leu Val Gln Glu Leu Trp Ser Gly Thr Gln 500 505 510 aag aat gtt gcc cca tta aag ctt cgg tgg acc ata gca aaa tat gct 1584 Lys Asn Val Ala Pro Leu Lys Leu Arg Trp Thr Ile Ala Lys Tyr Ala 515 520 525 ccc agg ttt aat ggg ttt cag caa cag gac tcc caa gaa ctt ctg gct 1632 Pro Arg Phe Asn Gly Phe Gln Gln Gln Asp Ser Gln Glu Leu Leu Ala 530 535 540 ttt ctc ttg gat ggt ctt cat gaa gat ctt aat cga gtc cat gaa aag 1680 Phe Leu Leu Asp Gly Leu His Glu Asp Leu Asn Arg Val His Glu Lys 545 550 555 560 cca tat gtg gaa ctg aag gac agt gat ggg cga cca gac tgg gaa gta 1728 Pro Tyr Val Glu Leu Lys Asp Ser Asp Gly Arg Pro Asp Trp Glu Val 565 570 575 gct gca gag gcc tgg gac aac cat cta aga aga aat aga tca att gtt 1776 Ala Ala Glu Ala Trp Asp Asn His Leu Arg Arg Asn Arg Ser Ile Val 580 585 590 gtg gat ttg ttc cat ggg cag cta aga tct caa gta aaa tgc aag aca 1824 Val Asp Leu Phe His Gly Gln Leu Arg Ser Gln Val Lys Cys Lys Thr 595 600 605 tgt ggg cat ata agt gtc cga ttt gac cct ttc aat ttt ttg tct ttg 1872 Cys Gly His Ile Ser Val Arg Phe Asp Pro Phe Asn Phe Leu Ser Leu 610 615 620 cca cta cca atg gac agt tat atg cac tta gaa ata aca gtg att aag 1920 Pro Leu Pro Met Asp Ser Tyr Met His Leu Glu Ile Thr Val Ile Lys 625 630 635 640 tta gat ggt act acc cct gta cgg tat gga cta aga ctg aat atg gat 1968 Leu Asp Gly Thr Thr Pro Val Arg Tyr Gly Leu Arg Leu Asn Met Asp 645 650 655 gaa aag tac aca ggt tta aaa aaa cag ctg agt gat ctc tgt gga ctt 2016 Glu Lys Tyr Thr Gly Leu Lys Lys Gln Leu Ser Asp Leu Cys Gly Leu 660 665 670 aat tca gaa caa atc ctt cta gca gaa gta cat ggt tcc aac ata aag 2064 Asn Ser Glu Gln Ile Leu Leu Ala Glu Val His Gly Ser Asn Ile Lys 675 680 685 aac ttt cct cag gac aac caa aaa gta cga ctc tca gtg 
agt gga ttt 2112 Asn Phe Pro Gln Asp Asn Gln Lys Val Arg Leu Ser Val Ser Gly Phe 690 695 700 ttg tgt gca ttt gaa att cct gtc cct gtg tct cca att tca gct tct 2160 Leu Cys Ala Phe Glu Ile Pro Val Pro Val Ser Pro Ile Ser Ala Ser 705 710 715 720 agt cca aca cag aca gat ttc tcc tct tcg cca tct aca aat gaa atg 2208 Ser Pro Thr Gln Thr Asp Phe Ser Ser Ser Pro Ser Thr Asn Glu Met 725 730 735 ttc acc cta act acc aat ggg gac cta ccc cga cca ata ttc atc ccc 2256 Phe Thr Leu Thr Thr Asn Gly Asp Leu Pro Arg Pro Ile Phe Ile Pro 740 745 750 aat gga atg cca aac act gtt gtg cca tgt gga act gag aag aac ttc 2304 Asn Gly Met Pro Asn Thr Val Val Pro Cys Gly Thr Glu Lys Asn Phe 755 760 765 aca aat gga atg gtt aat ggt cac atg cca tct ctt cct gac agc ccc 2352 Thr Asn Gly Met Val Asn Gly His Met Pro Ser Leu Pro Asp Ser Pro 770 775 780 ttt aca ggt tac atc att gca gtc cac cga aaa atg atg agg aca gaa 2400 Phe Thr Gly Tyr Ile Ile Ala Val His Arg Lys Met Met Arg Thr Glu 785 790 795 800 ctg tat ttc ctg tca tct cag aag aat cgc ccc agc ctc ttt gga atg 2448 Leu Tyr Phe Leu Ser Ser Gln Lys Asn Arg Pro Ser Leu Phe Gly Met 805 810 815 cca ttg att gtt cca tgt act gtg cat acc cgg aag aaa gac cta tat 2496 Pro Leu Ile Val Pro Cys Thr Val His Thr Arg Lys Lys Asp Leu Tyr 820 825 830 gat gcg gtt tgg att caa gta tcc cgg tta gcg agc cca ctc cca cct 2544 Asp Ala Val Trp Ile Gln Val Ser Arg Leu Ala Ser Pro Leu Pro Pro 835 840 845 cag gaa gct agt aat cat gcc cag gat tgt gac gac agt atg ggc tat 2592 Gln Glu Ala Ser Asn His Ala Gln Asp Cys Asp Asp Ser Met Gly Tyr 850 855 860 caa tat cca ttc act cta cga gtt gtg cag aaa gat ggg aac tcc tgt 2640 Gln Tyr Pro Phe Thr Leu Arg Val Val Gln Lys Asp Gly Asn Ser Cys 865 870 875 880 gct tgg tgc cca tgg tat aga ttt tgc aga ggc tgt aaa att gat tgt 2688 Ala Trp Cys Pro Trp Tyr Arg Phe Cys Arg Gly Cys Lys Ile Asp Cys 885 890 895 ggg gaa gac aga gct ttc att gga aat gcc tat atc gct gtg gat tgg 2736 Gly Glu Asp Arg Ala Phe Ile Gly Asn Ala Tyr Ile Ala Val Asp Trp 900 905 910 gat ccc 
aca gcc ctt cac ctt cgc tat caa aca tcc cag gaa agg gtt 2784 Asp Pro Thr Ala Leu His Leu Arg Tyr Gln Thr Ser Gln Glu Arg Val 915 920 925 gta gat gag cat gag agt gtg gag cag agt cgg cga gcg caa gcc gag 2832 Val Asp Glu His Glu Ser Val Glu Gln Ser Arg Arg Ala Gln Ala Glu 930 935 940 ccc atc aac ctg gac agc tgt ctc cgt gct ttc acc agt gag gaa gag 2880 Pro Ile Asn Leu Asp Ser Cys Leu Arg Ala Phe Thr Ser Glu Glu Glu 945 950 955 960 cta ggg gaa aat gag atg tac tac tgt tcc aag tgt aag acc cac tgc 2928 Leu Gly Glu Asn Glu Met Tyr Tyr Cys Ser Lys Cys Lys Thr His Cys 965 970 975 tta gca aca aag aag ctg gat ctc tgg agg ctt cca ccc atc ctg att 2976 Leu Ala Thr Lys Lys Leu Asp Leu Trp Arg Leu Pro Pro Ile Leu Ile 980 985 990 att cac ctt aag cga ttt caa ttt gta aat ggt cgg tgg ata aaa tca 3024 Ile His Leu Lys Arg Phe Gln Phe Val Asn Gly Arg Trp Ile Lys Ser 995 1000 1005 cag aaa att gtc aaa ttt cct cgg gaa agt ttt gat cca agt gct ttt 3072 Gln Lys Ile Val Lys Phe Pro Arg Glu Ser Phe Asp Pro Ser Ala Phe 1010 1015 1020 ttg gta cca aga gac ccg gct ctc tgc cag cat aaa cca ctc aca ccc 3120 Leu Val Pro Arg Asp Pro Ala Leu Cys Gln His Lys Pro Leu Thr Pro 1025 1030 1035 1040 cag ggg gat gag ctc tct gag ccc agg att ctg gca agg gag gtg aag 3168 Gln Gly Asp Glu Leu Ser Glu Pro Arg Ile Leu Ala Arg Glu Val Lys 1045 1050 1055 aaa gtg gat gcg cag agt tcg gct ggg gaa gag gac gtg ctc ctg agc 3216 Lys Val Asp Ala Gln Ser Ser Ala Gly Glu Glu Asp Val Leu Leu Ser 1060 1065 1070 aaa agc cca tcc tca ctc agc gct aac atc atc agc agc ccg aaa ggt 3264 Lys Ser Pro Ser Ser Leu Ser Ala Asn Ile Ile Ser Ser Pro Lys Gly 1075 1080 1085 tct cct tct tca tca aga aaa agt gga acc agc tgt ccc tcc agc aaa 3312 Ser Pro Ser Ser Ser Arg Lys Ser Gly Thr Ser Cys Pro Ser Ser Lys 1090 1095 1100 aac agc agc cct aat agc agc cca cgg act ttg ggg agg agc aaa ggg 3360 Asn Ser Ser Pro Asn Ser Ser Pro Arg Thr Leu Gly Arg Ser Lys Gly 1105 1110 1115 1120 agg ctc cgg ctg ccc cag att ggc agc aaa aat aaa ctg tca agt agt 3408 Arg Leu Arg Leu 
Pro Gln Ile Gly Ser Lys Asn Lys Leu Ser Ser Ser 1125 1130 1135 aaa gag aac ttg gat gcc agc aaa gaa aat ggg gct ggg cag ata tgt 3456 Lys Glu Asn Leu Asp Ala Ser Lys Glu Asn Gly Ala Gly Gln Ile Cys 1140 1145 1150 gag ctg gct gac gcc ttg agt cga ggg cat gtg ctg ggg ggc agc caa 3504 Glu Leu Ala Asp Ala Leu Ser Arg Gly His Val Leu Gly Gly Ser Gln 1155 1160 1165 cca gag ttg gtc act cct cag gac cat gag gta gct ttg gcc aat gga 3552 Pro Glu Leu Val Thr Pro Gln Asp His Glu Val Ala Leu Ala Asn Gly 1170 1175 1180 ttc ctt tat gag cat gaa gca tgt ggc aat ggc tac agc aat ggt cag 3600 Phe Leu Tyr Glu His Glu Ala Cys Gly Asn Gly Tyr Ser Asn Gly Gln 1185 1190 1195 1200 ctt gga aac cac agt gaa gaa gac agc act gat gac caa aga gaa gat 3648 Leu Gly Asn His Ser Glu Glu Asp Ser Thr Asp Asp Gln Arg Glu Asp 1205 1210 1215 act cgt att aag cct att tat aat cta tat gca att tcg tgc cat tca 3696 Thr Arg Ile Lys Pro Ile Tyr Asn Leu Tyr Ala Ile Ser Cys His Ser 1220 1225 1230 gga att ctg ggt ggg ggc cat tac gtc act tat gcc aaa aac cca aac 3744 Gly Ile Leu Gly Gly Gly His Tyr Val Thr Tyr Ala Lys Asn Pro Asn 1235 1240 1245 tgc aag tgg tac tgt tac aat gac agc agc tgt aag gaa ctt cac ccg 3792 Cys Lys Trp Tyr Cys Tyr Asn Asp Ser Ser Cys Lys Glu Leu His Pro 1250 1255 1260 gat gaa att gac acc gac tct gcc tac att ctt ttc tat gag cag cag 3840 Asp Glu Ile Asp Thr Asp Ser Ala Tyr Ile Leu Phe Tyr Glu Gln Gln 1265 1270 1275 1280 ggg ata gac tat gca caa ttt ctg cca aag act gat ggc aaa aag atg 3888 Gly Ile Asp Tyr Ala Gln Phe Leu Pro Lys Thr Asp Gly Lys Lys Met 1285 1290 1295 gca gac aca agc agt atg gat gaa gac ttt gag tct gat tac aaa aag 3936 Ala Asp Thr Ser Ser Met Asp Glu Asp Phe Glu Ser Asp Tyr Lys Lys 1300 1305 1310 tac tgt gtg tta cag taa 3954 Tyr Cys Val Leu Gln 1315 95 1317 PRT Homo sapiens 95 Met Val Ala Asp Ala Cys Asn Pro Asn Ser Leu Gly Asp Trp Gly Gly 1 5 10 15 Arg Ser Phe Glu Ala Arg Ser Leu Arg Pro Ala Trp Ala Gln Ala Val 20 25 30 Leu Pro Gln Pro Pro Lys Val Leu Gly Leu Gln Met Gly His Leu Thr 35 40 
45 Leu Glu Asp Tyr Gln Ile Trp Ser Val Lys Asn Val Leu Ala Asn Glu 50 55 60 Phe Leu Asn Leu Leu Phe Gln Val Cys His Ile Val Leu Gly Leu Arg 65 70 75 80 Pro Ala Thr Pro Glu Glu Glu Gly Gln Ile Ile Arg Gly Trp Leu Glu 85 90 95 Arg Glu Ser Arg Tyr Gly Leu Gln Ala Gly His Asn Trp Phe Ile Ile 100 105 110 Ser Met Gln Trp Trp Gln Gln Trp Lys Glu Tyr Val Lys Tyr Asp Ala 115 120 125 Asn Pro Val Val Ile Glu Pro Ser Ser Val Leu Asn Gly Gly Lys Tyr 130 135 140 Ser Phe Gly Thr Ala Ala His Pro Met Glu Gln Val Glu Asp Arg Ile 145 150 155 160 Gly Ser Ser Leu Ser Tyr Val Asn Thr Thr Glu Glu Lys Phe Ser Asp 165 170 175 Asn Ile Ser Thr Ala Ser Glu Ala Ser Glu Thr Ala Gly Ser Gly Phe 180 185 190 Leu Tyr Ser Ala Thr Pro Gly Ala Asp Val Cys Phe Ala Arg Gln His 195 200 205 Asn Thr Ser Asp Asn Asn Asn Gln Cys Leu Leu Gly Ala Asn Gly Asn 210 215 220 Ile Leu Leu His Leu Asn Pro Gln Lys Pro Gly Ala Ile Asp Asn Gln 225 230 235 240 Pro Leu Val Thr Gln Glu Pro Val Lys Ala Thr Ser Leu Thr Leu Glu 245 250 255 Gly Gly Arg Leu Lys Arg Thr Pro Gln Leu Ile His Gly Arg Asp Tyr 260 265 270 Glu Met Val Pro Glu Pro Val Trp Arg Ala Leu Tyr His Trp Tyr Gly 275 280 285 Ala Asn Leu Ala Leu Pro Arg Pro Val Ile Lys Asn Ser Lys Thr Asp 290 295 300 Ile Pro Glu Leu Glu Leu Phe Pro Arg Tyr Leu Leu Phe Leu Arg Gln 305 310 315 320 Gln Pro Ala Thr Arg Thr Gln Gln Ser Asn Ile Trp Val Asn Met Gly 325 330 335 Asn Val Pro Ser Pro Asn Ala Pro Leu Lys Arg Val Leu Ala Tyr Thr 340 345 350 Gly Cys Phe Ser Arg Met Gln Thr Ile Lys Glu Ile His Glu Tyr Leu 355 360 365 Ser Gln Arg Leu Arg Ile Lys Glu Glu Asp Met Arg Leu Trp Leu Tyr 370 375 380 Asn Ser Glu Asn Tyr Leu Thr Leu Leu Asp Asp Glu Asp His Lys Leu 385 390 395 400 Glu Tyr Leu Lys Ile Gln Asp Glu Gln His Leu Val Ile Glu Val Arg 405 410 415 Asn Lys Asp Met Ser Trp Pro Glu Glu Met Ser Phe Ile Ala Asn Ser 420 425 430 Ser Lys Ile Asp Arg His Lys Val Pro Thr Glu Lys Gly Ala Thr Gly 435 440 445 Leu Ser Asn Leu Gly Asn Thr Cys Phe Met Asn Ser Ser Ile Gln Cys 450 455 460 Val Ser 
Asn Thr Gln Pro Leu Thr Gln Tyr Phe Ile Ser Gly Arg His 465 470 475 480 Leu Tyr Glu Leu Asn Arg Thr Asn Pro Ile Gly Met Lys Gly His Met 485 490 495 Ala Lys Cys Tyr Gly Asp Leu Val Gln Glu Leu Trp Ser Gly Thr Gln 500 505 510 Lys Asn Val Ala Pro Leu Lys Leu Arg Trp Thr Ile Ala Lys Tyr Ala 515 520 525 Pro Arg Phe Asn Gly Phe Gln Gln Gln Asp Ser Gln Glu Leu Leu Ala 530 535 540 Phe Leu Leu Asp Gly Leu His Glu Asp Leu Asn Arg Val His Glu Lys 545 550 555 560 Pro Tyr Val Glu Leu Lys Asp Ser Asp Gly Arg Pro Asp Trp Glu Val 565 570 575 Ala Ala Glu Ala Trp Asp Asn His Leu Arg Arg Asn Arg Ser Ile Val 580 585 590 Val Asp Leu Phe His Gly Gln Leu Arg Ser Gln Val Lys Cys Lys Thr 595 600 605 Cys Gly His Ile Ser Val Arg Phe Asp Pro Phe Asn Phe Leu Ser Leu 610 615 620 Pro Leu Pro Met Asp Ser Tyr Met His Leu Glu Ile Thr Val Ile Lys 625 630 635 640 Leu Asp Gly Thr Thr Pro Val Arg Tyr Gly Leu Arg Leu Asn Met Asp 645 650 655 Glu Lys Tyr Thr Gly Leu Lys Lys Gln Leu Ser Asp Leu Cys Gly Leu 660 665 670 Asn Ser Glu Gln Ile Leu Leu Ala Glu Val His Gly Ser Asn Ile Lys 675 680 685 Asn Phe Pro Gln Asp Asn Gln Lys Val Arg Leu Ser Val Ser Gly Phe 690 695 700 Leu Cys Ala Phe Glu Ile Pro Val Pro Val Ser Pro Ile Ser Ala Ser 705 710 715 720 Ser Pro Thr Gln Thr Asp Phe Ser Ser Ser Pro Ser Thr Asn Glu Met 725 730 735 Phe Thr Leu Thr Thr Asn Gly Asp Leu Pro Arg Pro Ile Phe Ile Pro 740 745 750 Asn Gly Met Pro Asn Thr Val Val Pro Cys Gly Thr Glu Lys Asn Phe 755 760 765 Thr Asn Gly Met Val Asn Gly His Met Pro Ser Leu Pro Asp Ser Pro 770 775 780 Phe Thr Gly Tyr Ile Ile Ala Val His Arg Lys Met Met Arg Thr Glu 785 790 795 800 Leu Tyr Phe Leu Ser Ser Gln Lys Asn Arg Pro Ser Leu Phe Gly Met 805 810 815 Pro Leu Ile Val Pro Cys Thr Val His Thr Arg Lys Lys Asp Leu Tyr 820 825 830 Asp Ala Val Trp Ile Gln Val Ser Arg Leu Ala Ser Pro Leu Pro Pro 835 840 845 Gln Glu Ala Ser Asn His Ala Gln Asp Cys Asp Asp Ser Met Gly Tyr 850 855 860 Gln Tyr Pro Phe Thr Leu Arg Val Val Gln Lys Asp Gly Asn Ser Cys 865 870 875 880 Ala Trp 
Cys Pro Trp Tyr Arg Phe Cys Arg Gly Cys Lys Ile Asp Cys 885 890 895 Gly Glu Asp Arg Ala Phe Ile Gly Asn Ala Tyr Ile Ala Val Asp Trp 900 905 910 Asp Pro Thr Ala Leu His Leu Arg Tyr Gln Thr Ser Gln Glu Arg Val 915 920 925 Val Asp Glu His Glu Ser Val Glu Gln Ser Arg Arg Ala Gln Ala Glu 930 935 940 Pro Ile Asn Leu Asp Ser Cys Leu Arg Ala Phe Thr Ser Glu Glu Glu 945 950 955 960 Leu Gly Glu Asn Glu Met Tyr Tyr Cys Ser Lys Cys Lys Thr His Cys 965 970 975 Leu Ala Thr Lys Lys Leu Asp Leu Trp Arg Leu Pro Pro Ile Leu Ile 980 985 990 Ile His Leu Lys Arg Phe Gln Phe Val Asn Gly Arg Trp Ile Lys Ser 995 1000 1005 Gln Lys Ile Val Lys Phe Pro Arg Glu Ser Phe Asp Pro Ser Ala Phe 1010 1015 1020 Leu Val Pro Arg Asp Pro Ala Leu Cys Gln His Lys Pro Leu Thr Pro 1025 1030 1035 1040 Gln Gly Asp Glu Leu Ser Glu Pro Arg Ile Leu Ala Arg Glu Val Lys 1045 1050 1055 Lys Val Asp Ala Gln Ser Ser Ala Gly Glu Glu Asp Val Leu Leu Ser 1060 1065 1070 Lys Ser Pro Ser Ser Leu Ser Ala Asn Ile Ile Ser Ser Pro Lys Gly 1075 1080 1085 Ser Pro Ser Ser Ser Arg Lys Ser Gly Thr Ser Cys Pro Ser Ser Lys 1090 1095 1100 Asn Ser Ser Pro Asn Ser Ser Pro Arg Thr Leu Gly Arg Ser Lys Gly 1105 1110 1115 1120 Arg Leu Arg Leu Pro Gln Ile Gly Ser Lys Asn Lys Leu Ser Ser Ser 1125 1130 1135 Lys Glu Asn Leu Asp Ala Ser Lys Glu Asn Gly Ala Gly Gln Ile Cys 1140 1145 1150 Glu Leu Ala Asp Ala Leu Ser Arg Gly His Val Leu Gly Gly Ser Gln 1155 1160 1165 Pro Glu Leu Val Thr Pro Gln Asp His Glu Val Ala Leu Ala Asn Gly 1170 1175 1180 Phe Leu Tyr Glu His Glu Ala Cys Gly Asn Gly Tyr Ser Asn Gly Gln 1185 1190 1195 1200 Leu Gly Asn His Ser Glu Glu Asp Ser Thr Asp Asp Gln Arg Glu Asp 1205 1210 1215 Thr Arg Ile Lys Pro Ile Tyr Asn Leu Tyr Ala Ile Ser Cys His Ser 1220 1225 1230 Gly Ile Leu Gly Gly Gly His Tyr Val Thr Tyr Ala Lys Asn Pro Asn 1235 1240 1245 Cys Lys Trp Tyr Cys Tyr Asn Asp Ser Ser Cys Lys Glu Leu His Pro 1250 1255 1260 Asp Glu Ile Asp Thr Asp Ser Ala Tyr Ile Leu Phe Tyr Glu Gln Gln 1265 1270 1275 1280 Gly Ile Asp Tyr Ala Gln Phe 
Leu Pro Lys Thr Asp Gly Lys Lys Met 1285 1290 1295 Ala Asp Thr Ser Ser Met Asp Glu Asp Phe Glu Ser Asp Tyr Lys Lys 1300 1305 1310 Tyr Cys Val Leu Gln 1315 96 32 PRT Artificial Sequence consensus sequence 96 Thr Gly Leu Ile Asn Leu Gly Asn Thr Cys Tyr Met Asn Ser Val Leu 1 5 10 15 Gln Cys Leu Phe Ser Ile Pro Pro Leu Arg Asp Tyr Leu Leu Asp Ile 20 25 30 97 69 PRT Artificial Sequence consensus sequence 97 Gly Pro Gly Lys Tyr Glu Leu Tyr Ala Val Val Val His Ser Gly Ser 1 5 10 15 Ser Leu Ser Gly Gly His Tyr Thr Ala Tyr Val Lys Lys Glu Asn Trp 20 25 30 Tyr Lys Phe Asp Asp Asp Lys Val Ser Arg Val Thr Glu Glu Glu Val 35 40 45 Leu Lys Glu Ser Gly Gly Glu Ser Gly Asp Thr Ser Ser Ala Tyr Ile 50 55 60 Leu Phe Tyr Glu Arg 65 98 19 PRT Artificial Sequence exemplary motif 98 Tyr Xaa Leu Xaa Xaa Xaa Xaa Xaa His Xaa Gly Xaa Xaa Xaa Xaa Xaa 1 5 10 15 Gly His Tyr 99 15 PRT Homo sapiens 99 Leu Ser Asn Leu Gly Asn Thr Cys Phe Met Asn Ser Ser Ile Gln 1 5 10 15 100 18 PRT Homo sapiens 100 Tyr Asn Leu Tyr Ala Ile Ser Cys His Ser Gly Ile Leu Gly Gly Gly 1 5 10 15 His Tyr 101 1182 DNA Homo sapiens CDS (326)...(1006) 101 tggaggtaga aaacttttat tagcksgtcc ggttgaggcc tacagcgggg aaaggacttg 60 ccagattttc gccgcaagtc agggccatag cggggggcat aacaaggcct cccaaccgaa 120 ggtcaagcaa gagctccgag cgtcccacac aagtcccgaa gggacactgt gacgccgcgc 180 tactgaaggc gcctgggttc ccggactcgg ccaccgcctc gccgcttccg cccctcagaa 240 gcatggcggc cacgtagccc ggcccggatt ggacgttggc ggtggacgcc aaacagttgg 300 caacacgatt ggctgctgcg gggtg atg acg tca ggg ggc ggt gtc gga gtg 352 Met Thr Ser Gly Gly Gly Val Gly Val 1 5 aat ggg ggc agc atg agg ccg ggc ggc ttt ttg ggc gcc gga cag cgg 400 Asn Gly Gly Ser Met Arg Pro Gly Gly Phe Leu Gly Ala Gly Gln Arg 10 15 20 25 ctg agt aga gcc atg agc cga tgt gtt ttg gag cct cgc ccc ccg ggg 448 Leu Ser Arg Ala Met Ser Arg Cys Val Leu Glu Pro Arg Pro Pro Gly 30 35 40 aag cgg tgg atg gtg gct ggc ctg ggg aat ccc gga ctg ccc ggc acg 496 Lys Arg Trp Met Val Ala Gly Leu Gly Asn Pro Gly Leu Pro Gly Thr 45 50 55 cga 
cac agc gtg ggc atg gcg gtg ctg ggg cag ctg gcg cgg cgg ctg 544 Arg His Ser Val Gly Met Ala Val Leu Gly Gln Leu Ala Arg Arg Leu 60 65 70 ggt gtg gcg gag agt tgg acg cgc gac cgg cac tgt gcc gcc gac ctc 592 Gly Val Ala Glu Ser Trp Thr Arg Asp Arg His Cys Ala Ala Asp Leu 75 80 85 gcc ctg gcc ccg ctg ggg gat gcc caa ctg gtc ctg ctc cgg cca cgg 640 Ala Leu Ala Pro Leu Gly Asp Ala Gln Leu Val Leu Leu Arg Pro Arg 90 95 100 105 cgg ctt atg aac gcc aac ggg cgc agc gtg gcc cgg gct gcg gag ctg 688 Arg Leu Met Asn Ala Asn Gly Arg Ser Val Ala Arg Ala Ala Glu Leu 110 115 120 ttt ggg ctg act gcc gag gaa gtc tac ctg gtg cat gat gag ctg gac 736 Phe Gly Leu Thr Ala Glu Glu Val Tyr Leu Val His Asp Glu Leu Asp 125 130 135 aag ccc ctg ggg aga ctg gct ctg aag ctg ggg ggc agt gcc agg ggc 784 Lys Pro Leu Gly Arg Leu Ala Leu Lys Leu Gly Gly Ser Ala Arg Gly 140 145 150 cac aat gga gtc cgt tcc tgc att agc tgc ctc aac tcc aat gca atg 832 His Asn Gly Val Arg Ser Cys Ile Ser Cys Leu Asn Ser Asn Ala Met 155 160 165 cca agg ctg cgg gtg ggt atc ggg cgc ccg gcg cac cct gag gcg gtt 880 Pro Arg Leu Arg Val Gly Ile Gly Arg Pro Ala His Pro Glu Ala Val 170 175 180 185 cag gcc cat gtg ctg ggc tgc ttc tcc cct gct gag cag gag ctg ctg 928 Gln Ala His Val Leu Gly Cys Phe Ser Pro Ala Glu Gln Glu Leu Leu 190 195 200 cct ctg ttg ctg gat cga gcc acc gac ctg atc ttg gac cac atc cgt 976 Pro Leu Leu Leu Asp Arg Ala Thr Asp Leu Ile Leu Asp His Ile Arg 205 210 215 gag cga agc cag ggg ccc tca ctg ggg ccg tgacactagt ggccatggct 1026 Glu Arg Ser Gln Gly Pro Ser Leu Gly Pro 220 225 gcctgcctga ctgtagtgcc caccaaccca gccactgcca cagagctgcc acgccagcct 1086 tggtatctac tttttataca aatctcctct agactgttcc aggctgcctg cggattaaag 1146 tgggggtgac tgtgaaaaaa aaaaaaaaaa aaagga 1182 102 227 PRT Homo sapiens 102 Met Thr Ser Gly Gly Gly Val Gly Val Asn Gly Gly Ser Met Arg Pro 1 5 10 15 Gly Gly Phe Leu Gly Ala Gly Gln Arg Leu Ser Arg Ala Met Ser Arg 20 25 30 Cys Val Leu Glu Pro Arg Pro Pro Gly Lys Arg Trp Met Val Ala Gly 35 40 45 Leu Gly Asn 
Pro Gly Leu Pro Gly Thr Arg His Ser Val Gly Met Ala 50 55 60 Val Leu Gly Gln Leu Ala Arg Arg Leu Gly Val Ala Glu Ser Trp Thr 65 70 75 80 Arg Asp Arg His Cys Ala Ala Asp Leu Ala Leu Ala Pro Leu Gly Asp 85 90 95 Ala Gln Leu Val Leu Leu Arg Pro Arg Arg Leu Met Asn Ala Asn Gly 100 105 110 Arg Ser Val Ala Arg Ala Ala Glu Leu Phe Gly Leu Thr Ala Glu Glu 115 120 125 Val Tyr Leu Val His Asp Glu Leu Asp Lys Pro Leu Gly Arg Leu Ala 130 135 140 Leu Lys Leu Gly Gly Ser Ala Arg Gly His Asn Gly Val Arg Ser Cys 145 150 155 160 Ile Ser Cys Leu Asn Ser Asn Ala Met Pro Arg Leu Arg Val Gly Ile 165 170 175 Gly Arg Pro Ala His Pro Glu Ala Val Gln Ala His Val Leu Gly Cys 180 185 190 Phe Ser Pro Ala Glu Gln Glu Leu Leu Pro Leu Leu Leu Asp Arg Ala 195 200 205 Thr Asp Leu Ile Leu Asp His Ile Arg Glu Arg Ser Gln Gly Pro Ser 210 215 220 Leu Gly Pro 225 103 684 DNA Homo sapiens 103 atgacgtcag ggggcggtgt cggagtgaat gggggcagca tgaggccggg cggctttttg 60 ggcgccggac agcggctgag tagagccatg agccgatgtg ttttggagcc tcgccccccg 120 gggaagcggt ggatggtggc tggcctgggg aatcccggac tgcccggcac gcgacacagc 180 gtgggcatgg cggtgctggg gcagctggcg cggcggctgg gtgtggcgga gagttggacg 240 cgcgaccggc actgtgccgc cgacctcgcc ctggccccgc tgggggatgc ccaactggtc 300 ctgctccggc cacggcggct tatgaacgcc aacgggcgca gcgtggcccg ggctgcggag 360 ctgtttgggc tgactgccga ggaagtctac ctggtgcatg atgagctgga caagcccctg 420 gggagactgg ctctgaagct ggggggcagt gccaggggcc acaatggagt ccgttcctgc 480 attagctgcc tcaactccaa tgcaatgcca aggctgcggg tgggtatcgg gcgcccggcg 540 caccctgagg cggttcaggc ccatgtgctg ggctgcttct cccctgctga gcaggagctg 600 ctgcctctgt tgctggatcg agccaccgac ctgatcttgg accacatccg tgagcgaagc 660 caggggccct cactggggcc gtga 684 104 193 PRT Artificial Sequence consensus sequence 104 Thr Ile Lys Leu Ile Val Gly Leu Gly Asn Pro Gly Lys Gln Tyr Ala 1 5 10 15 Glu Thr Arg His Asn Ala Gly Phe Met Val Leu Asp Leu Leu Ala Ser 20 25 30 Arg Leu Gly Leu Ser Leu Arg Glu Glu Lys Arg Phe Phe Gly Leu Gly 35 40 45 Gly Lys Val Leu Val Ser Gly Lys Lys His Cys Val Ile Leu 
Leu Lys 50 55 60 Pro Arg Thr Tyr Met Asn Leu Ser Gly Lys Ala Val Leu Ala Leu Ala 65 70 75 80 Ser Phe Tyr Lys Ile Lys Pro Glu Glu Ile Leu Val Val His Asp Asp 85 90 95 Leu Asp Leu Pro Leu Gly Lys Ile Arg Leu Lys Gln Gly Gly Gly Ala 100 105 110 Gly Arg Gly His Asn Gly Leu Lys Ser Ile Ile Ser His Leu Gly Asn 115 120 125 Thr Asn Asn Phe Asn Arg Leu Arg Ile Gly Ile Gly Arg Pro Asn Pro 130 135 140 Gly Ser Asn Asp Val Ala Glu Phe Val Leu Ser Lys Phe Ser Pro Ala 145 150 155 160 Glu Arg Pro Leu Leu Glu Lys Ala Leu Asp Lys Ala Ile Glu Ala Leu 165 170 175 Glu Met Ile Ile Glu Gly His Gly Met Asn Lys Leu Met Asn Arg Phe 180 185 190 Asn 

What is claimed is:
 1. An isolated nucleic acid molecule selected from the group consisting of: a) a nucleic acid comprising the nucleotide sequence of SEQ ID NO:1, 3, 4, 6, 10, 12, 16, 18, 22, 24, 27, 29, 33, 35, 41, 43, 48, 50, 53, 55, 61, 63, 74, 76, 77, 79, 80, 82, 94, 101 or 103; and b) a nucleic acid molecule which encodes a polypeptide comprising the amino acid sequence of SEQ ID NO:2, 5, 11, 17, 23, 28, 34, 42, 49, 54, 62, 75, 78, 81, 95 or 102.
 2. The nucleic acid molecule of claim 1, further comprising vector nucleic acid sequences.
 3. The nucleic acid molecule of claim 1, further comprising nucleic acid sequences encoding a heterologous polypeptide.
 4. A host cell which contains the nucleic acid molecule of claim 1.
 5. An isolated polypeptide comprising the amino acid sequence of SEQ ID NO:2, 5, 11, 17, 23, 28, 34, 42, 49, 54, 62, 75, 78, 81, 95 or 102.
 6. The polypeptide of claim 5 further comprising heterologous amino acid sequences.
 7. An antibody or antigen-binding fragment thereof that selectively binds to a polypeptide of claim 5.
 8. A method for producing a polypeptide comprising the amino acid sequence of SEQ ID NO:2, 5, 11, 17, 23, 28, 34, 42, 49, 54, 62, 75, 78, 81, 95 or 102, the method comprising culturing the host cell of claim 4 under conditions in which the nucleic acid molecule is expressed.
 9. A method for detecting the presence of a polypeptide of claim 5 in a sample, comprising: a) contacting the sample with a compound which selectively binds to the polypeptide; and b) determining whether the compound binds to the polypeptide in the sample.
 10. The method of claim 9, wherein the compound which binds to the polypeptide is an antibody.
 11. A kit comprising a compound which selectively binds to a polypeptide of claim 5 and instructions for use.
 12. A method for detecting the presence of a nucleic acid molecule of claim 1 in a sample, comprising the steps of: a) contacting the sample with a nucleic acid probe or primer which selectively hybridizes to the nucleic acid molecule; and b) determining whether the nucleic acid probe or primer binds to a nucleic acid molecule in the sample.
 13. The method of claim 12, wherein the sample comprises mRNA molecules and is contacted with a nucleic acid probe.
 14. A kit comprising a compound which selectively hybridizes to a nucleic acid molecule of claim 1 and instructions for use.
 15. A method for identifying a compound which binds to a polypeptide of claim 5 comprising the steps of: a) contacting a polypeptide, or a cell expressing a polypeptide of claim 5, with a test compound; and b) determining whether the polypeptide binds to the test compound.
 16. A method for modulating the activity of a polypeptide of claim 5, comprising contacting a polypeptide or a cell expressing a polypeptide of claim 5 with a compound which binds to the polypeptide in a sufficient concentration to modulate the activity of the polypeptide.
 17. A method of inhibiting aberrant activity of a 26443, 46873, 61833, 26493, 58224, 46980, 32225, 47508, 56939, 33410, 33521, 23479, 48120, 46689, 80091, or 46508-expressing cell, comprising contacting a 26443, 46873, 61833, 26493, 58224, 46980, 32225, 47508, 56939, 33410, 33521, 23479, 48120, 46689, 80091, or 46508-expressing cell with a compound that modulates the activity or expression of a polypeptide of claim 5, in an amount which is effective to reduce or inhibit the aberrant activity of the cell.
 18. The method of claim 17, wherein the compound is selected from the group consisting of a peptide, a phosphopeptide, a small organic molecule, and an antibody.
 19. A method of treating or preventing a disorder characterized by aberrant activity of a 26443, 46873, 61833, 26493, 58224, 46980, 32225, 47508, 56939, 33410, 33521, 23479, 48120, 46689, 80091, or 46508-expressing cell, in a subject, comprising: administering to the subject an effective amount of a compound that modulates the activity or expression of a nucleic acid molecule of claim 1, such that the aberrant activity of the 26443, 46873, 61833, 26493, 58224, 46980, 32225, 47508, 56939, 33410, 33521, 23479, 48120, 46689, 80091, or 46508-expressing cell is reduced or inhibited. 